[FFmpeg-devel] [PATCH] avcodec/amfenc: Fix for windows imprecise sleep

Evgeny Pavlov lucenticus at gmail.com
Tue Oct 17 20:11:19 EEST 2023


The reason for using av_usleep() here is that AMF API doesn’t provide an
API for explicit wait. There are two modes to get output from encoder:

1. Polling with some sleep to avoid CPU thrashing – currently used in FFmpeg

2. Set timeout parameter on AMF encoder and QueryOutput call will block
till output is available or the timeout happens.

#2 is the preferable way but it is designed more to be used with a separate
polling thread. With a single-thread approach in FFmpeg, the use of timeout
can block input submission making things slower.  This is even more
pronounced when B-frames are enabled and several inputs are needed to produce
the first output.

The condition of this sleep is in special events (primarily when amf input
queue is full), not the core loop part. During the experiments the cpu
increasing is about 2-4% or so, not a burst.

For low resolution encoding,  these changes bring significant performance
improvement (about 15%). It will not bring improvement for high resolution
such as 4K.


Thanks,

Evgeny

вт, 17 окт. 2023 г. в 03:26, Zhao Zhili <quinkblack at foxmail.com>:

>
> > 在 2023年10月17日,上午5:24,Mark Thompson <sw at jkqxz.net> 写道:
> >
> > On 16/10/2023 10:13, Evgeny Pavlov wrote:
> >> This commit reduces the sleep time on Windows to improve AMF encoding
> >> performance on low resolution input videos.
> >> This fix is for Windows only, because sleep() function isn't
> >> very accurate on Windows OS.
> >> Fix for issue #10622
> >> Signed-off-by: Evgeny Pavlov <lucenticus at gmail.com>
> >> ---
> >>  libavcodec/amfenc.c | 4 ++++
> >>  1 file changed, 4 insertions(+)
> >> diff --git a/libavcodec/amfenc.c b/libavcodec/amfenc.c
> >> index 061859f85c..0c95465d6e 100644
> >> --- a/libavcodec/amfenc.c
> >> +++ b/libavcodec/amfenc.c
> >> @@ -770,7 +770,11 @@ int ff_amf_receive_packet(AVCodecContext *avctx,
> AVPacket *avpkt)
> >>          if (query_output_data_flag == 0) {
> >>              if (res_resubmit == AMF_INPUT_FULL || ctx->delayed_drain
> || (ctx->eof && res_query != AMF_EOF) || (ctx->hwsurfaces_in_queue >=
> ctx->hwsurfaces_in_queue_max)) {
> >>                  block_and_wait = 1;
> >> +#ifdef _WIN32
> >> +                av_usleep(0); //Sleep() is not precise on Windows OS.
> >> +#else
> >>                  av_usleep(1000);
> >> +#endif
> >>              }
> >>          }
> >>      } while (block_and_wait);
> >
> > Wasting lots of power by spinning on a CPU core does not seem like a
> good answer to this problem.  (I mean, presumably that is why Windows isn't
> honouring your request for a short sleep, because it wants timers to have
> larger gaps to avoid wasting power.)
>
> If av_usleep is implemented via Sleep like current case, sleep 0 means
> yield current thread, so it’s not busy wait in normal case (but it can be
> busy wait).
>
> av_usleep(500) may looks better and do the same job by depending 500/1000
> = 0.
>
> I agree use sleep without real async is like a bug.
>
> >
> > Why is there a sleep here at all, anyway?  An API for hardware encoding
> should be providing a way for the caller to wait for an outstanding
> operation to complete.
> >
> > Thanks,
> >
> > - Mark
> > _______________________________________________
> > ffmpeg-devel mailing list
> > ffmpeg-devel at ffmpeg.org
> > https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
> >
> > To unsubscribe, visit link above, or email
>
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel at ffmpeg.org
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>
> To unsubscribe, visit link above, or email
> ffmpeg-devel-request at ffmpeg.org with subject "unsubscribe".
>


More information about the ffmpeg-devel mailing list