[FFmpeg-devel] [PATCH] avcodec/amfenc: Fix for windows imprecise sleep
Evgeny Pavlov
lucenticus at gmail.com
Wed Oct 18 13:32:43 EEST 2023
On Tue, Oct 17, 2023 at 9:45 PM Kacper Michajlow <kasper93 at gmail.com> wrote:
> On Tue, 17 Oct 2023 at 19:34, Evgeny Pavlov <lucenticus at gmail.com> wrote:
> >
> > The reason for using av_usleep() here is that AMF API doesn’t provide an
> > API for explicit wait. There are two modes to get output from encoder:
> >
> > 1. Polling with some sleep to avoid CPU thrashing – currently used in
> FFmpeg
> >
> > 2. Set timeout parameter on AMF encoder and QueryOutput call will block
> > till output is available or the timeout happens.
> >
> > #2 is the preferable way but it is designed more to be used with a
> separate
> > polling thread. With a single-thread approach in FFmpeg, the use of
> timeout
> > can block input submission making things slower. This is even more
> > pronounced when B-frames are enabled and several inputs are needed to
> produce
> > the first output.
> >
> > The condition of this sleep is in special events (primarily when amf
> input
> > queue is full), not the core loop part. During the experiments the cpu
> > increasing is about 2-4% or so, not a burst.
> >
> > For low resolution encoding, these changes bring significant performance
> > improvement (about 15%). It will not bring improvement for high
> resolution
> > such as 4K.
> >
> >
> > Thanks,
> >
> > Evgeny
> >
> > вт, 17 окт. 2023 г. в 03:26, Zhao Zhili <quinkblack at foxmail.com>:
> >
> > >
> > > > 在 2023年10月17日,上午5:24,Mark Thompson <sw at jkqxz.net> 写道:
> > > >
> > > > On 16/10/2023 10:13, Evgeny Pavlov wrote:
> > > >> This commit reduces the sleep time on Windows to improve AMF
> encoding
> > > >> performance on low resolution input videos.
> > > >> This fix is for Windows only, because sleep() function isn't
> > > >> very accurate on Windows OS.
> > > >> Fix for issue #10622
> > > >> Signed-off-by: Evgeny Pavlov <lucenticus at gmail.com>
> > > >> ---
> > > >> libavcodec/amfenc.c | 4 ++++
> > > >> 1 file changed, 4 insertions(+)
> > > >> diff --git a/libavcodec/amfenc.c b/libavcodec/amfenc.c
> > > >> index 061859f85c..0c95465d6e 100644
> > > >> --- a/libavcodec/amfenc.c
> > > >> +++ b/libavcodec/amfenc.c
> > > >> @@ -770,7 +770,11 @@ int ff_amf_receive_packet(AVCodecContext
> *avctx,
> > > AVPacket *avpkt)
> > > >> if (query_output_data_flag == 0) {
> > > >> if (res_resubmit == AMF_INPUT_FULL ||
> ctx->delayed_drain
> > > || (ctx->eof && res_query != AMF_EOF) || (ctx->hwsurfaces_in_queue >=
> > > ctx->hwsurfaces_in_queue_max)) {
> > > >> block_and_wait = 1;
> > > >> +#ifdef _WIN32
> > > >> + av_usleep(0); //Sleep() is not precise on Windows
> OS.
> > > >> +#else
> > > >> av_usleep(1000);
> > > >> +#endif
> > > >> }
> > > >> }
> > > >> } while (block_and_wait);
> > > >
> > > > Wasting lots of power by spinning on a CPU core does not seem like a
> > > good answer to this problem. (I mean, presumably that is why Windows
> isn't
> > > honouring your request for a short sleep, because it wants timers to
> have
> > > larger gaps to avoid wasting power.)
> > >
> > > If av_usleep is implemented via Sleep like current case, sleep 0 means
> > > yield current thread, so it’s not busy wait in normal case (but it can
> be
> > > busy wait).
> > >
> > > av_usleep(500) may looks better and do the same job by depending
> 500/1000
> > > = 0.
> > >
> > > I agree use sleep without real async is like a bug.
> > >
> > > >
> > > > Why is there a sleep here at all, anyway? An API for hardware
> encoding
> > > should be providing a way for the caller to wait for an outstanding
> > > operation to complete.
> > > >
> > > > Thanks,
> > > >
> > > > - Mark
> > > > _______________________________________________
> > > > ffmpeg-devel mailing list
> > > > ffmpeg-devel at ffmpeg.org
> > > > https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
> > > >
> > > > To unsubscribe, visit link above, or email
> > >
> > > _______________________________________________
> > > ffmpeg-devel mailing list
> > > ffmpeg-devel at ffmpeg.org
> > > https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
> > >
> > > To unsubscribe, visit link above, or email
> > > ffmpeg-devel-request at ffmpeg.org with subject "unsubscribe".
> > >
> > _______________________________________________
> > ffmpeg-devel mailing list
> > ffmpeg-devel at ffmpeg.org
> > https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
> >
> > To unsubscribe, visit link above, or email
> > ffmpeg-devel-request at ffmpeg.org with subject "unsubscribe".
>
> Please don't top-post. I'll bottom-post now and no one will know how
> to read this email.
>
> If you need more precise sleep on Windows, your application should use
> timeBeginPeriod/timeEndPeriod API, see
>
> https://learn.microsoft.com/en-us/windows/win32/api/timeapi/nf-timeapi-timebeginperiod
>
> This sleep shouldn't be there to begin with and removing it only for
> Windows, seems like a hacky workaround.
>
> Sleep on Windows is accurate, when you request a timer resolution
> appropriate for your application. You probably don't do that, and have
> unexpectedly long sleeps, but it is not because they are "not
> accurate", it is because you don't ask for it.
>
> Side note, with `Sleep()` you can request only 1 ms sleep, but with
> with waitable timers
> https://learn.microsoft.com/en-us/windows/win32/sync/waitable-timer-objects
> you can go down to 0.5 ms, which seems currently be the lowest
> interval that Windows kernel will wake anything up in practice.
>
> - Kacper
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel at ffmpeg.org
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>
> To unsubscribe, visit link above, or email
> ffmpeg-devel-request at ffmpeg.org with subject "unsubscribe".
>
I can use the code similar from AMF with timeBeginPeriod/timeEndPeriod
API(please ignore commented code)
AMF/amf/public/common/Windows/ThreadWindows.cpp
at master · GPUOpen-LibrariesAndSDKs/AMF · GitHub
<https://github.com/GPUOpen-LibrariesAndSDKs/AMF/blob/master/amf/public/common/Windows/ThreadWindows.cpp#L303>
But in my opinion, an alternative suggestion from Zhao Zhili to use
av_usleep(500) will be more suitable for ffmpeg,
because I found similar code for QSV components in ffmpeg.
More information about the ffmpeg-devel
mailing list