[FFmpeg-devel] Decoding performance -f rawvideo pipe:1 vs BMP images output

Clément Péron peron.clem at gmail.com
Fri Dec 6 20:30:08 EET 2024


Hi,

On Fri, 6 Dec 2024 at 18:55, Clément Péron <peron.clem at gmail.com> wrote:
>
> Hi,
>
> I am trying to convert an RTSP stream into a series of frames that I send
> to a stdout pipe with low latency.
>
> I first tried this command:
>
> "ffmpeg -hwaccel cuda -flags +low_delay -fflags +nobuffer -nostats
> -debug_ts -re -rtsp_flags prefer_tcp -rtsp_transport tcp -i
> RTSP_STREAM -f rawvideo -fps_mode passthrough -pix_fmt rgb24 pipe:1 >
> /dev/null"
> In real use, my application reads the pipe instead of /dev/null.
>
> Interestingly, most of the time is spent in the encode part.
> >>>>
> latency(total:136.931ms, decode 3.355ms/2%, decode-filter: 8.426ms/6%,
> filter 3.191ms/2%, encode 120.774ms/88%)
> latency(total:73.519ms, decode 1.592ms/2%, decode-filter: 2.047ms/2%,
> filter 2.928ms/3%, encode 66.856ms/90%)
> latency(total:139.766ms, decode 1.885ms/1%, filter 1.898ms/1%, encode
> 135.023ms/96%)
> latency(total:71.03ms, decode 3.524ms/4%, decode-filter: 1.503ms/2%,
> filter 1.189ms/1%, encode 64.743ms/91%)
> latency(total:134.037ms, decode 1.935ms/1%, encode 130.176ms/97%)
> <<<<
>
> If I compare this with writing the frames as individual BMP files:
> "ffmpeg -hwaccel cuda -flags +low_delay -fflags +nobuffer -debug_ts
> -nostats -re -rtsp_flags prefer_tcp -rtsp_transport tcp -i RTSP_STREAM
> -fps_mode passthrough -pix_fmt rgb24 output_image_%03d.bmp"
>
> >>>>
> latency(total:18.478ms, decode 3.222ms/17%, decode-filter:
> 9.715ms/52%, filter 2.576ms/13%, encode 2.771ms/14%)
> latency(total:13.019ms, decode 1.857ms/14%, decode-filter:
> 2.184ms/16%, filter 3.348ms/25%, encode 5.468ms/42%)
> latency(total:6.565ms, decode 1.642ms/25%, decode-filter: 1.1ms/16%,
> filter 0.63ms/9%, encode 3.105ms/47%)
> latency(total:6.628ms, decode 2.116ms/31%, decode-filter: 1.131ms/17%,
> filter 1.851ms/27%, filter-encode: 0.085ms/1%, encode 1.324ms/19%,
> encode-mux: 0.075ms/1%)
> latency(total:3.932ms, decode 1.642ms/41%, decode-filter: 0.588ms/14%,
> filter 0.863ms/21%, encode 0.779ms/19%)
> latency(total:4.528ms, decode 1.91ms/42%, decode-filter: 0.766ms/16%,
> filter 1.061ms/23%, encode 0.694ms/15%)
> <<<<
>
> With BMP output, the encode time is much more acceptable.
> Do you know why there is such a difference?
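
For context, the consumer side of the rawvideo command can be sketched
roughly as below. This is only an illustration, not the real application:
the function name read_frames and the width/height arguments are
assumptions, since rawvideo carries no header and the reader has to know
the frame geometry in advance. With rgb24, one frame is width * height * 3
bytes.

```python
from typing import BinaryIO, Iterator


def read_frames(stream: BinaryIO, width: int, height: int,
                bytes_per_pixel: int = 3) -> Iterator[bytes]:
    """Yield complete raw frames from a binary stream (hypothetical helper).

    bytes_per_pixel defaults to 3, which matches -pix_fmt rgb24.
    """
    frame_size = width * height * bytes_per_pixel
    while True:
        buf = bytearray()
        # A pipe can return short reads, so keep reading until the
        # frame is complete or the stream ends.
        while len(buf) < frame_size:
            chunk = stream.read(frame_size - len(buf))
            if not chunk:
                # End of stream: an incomplete trailing frame is dropped.
                return
            buf.extend(chunk)
        yield bytes(buf)
```

A reader attached to the pipe would then iterate over
read_frames(sys.stdin.buffer, WIDTH, HEIGHT), with WIDTH and HEIGHT set to
the stream resolution (values not given in this mail).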

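After investigating, it seems that the AV_CODEC_CAP_FRAME_THREADS
capability of the rawvideo encoder has a large impact on the processing
time. Removing it, as in the patch below, gives much lower encode times: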
--- a/libavcodec/rawenc.c
+++ b/libavcodec/rawenc.c
@@ -98,7 +98,7 @@ const FFCodec ff_rawvideo_encoder = {
     .p.id           = AV_CODEC_ID_RAWVIDEO,
-    .p.capabilities = AV_CODEC_CAP_DR1 | AV_CODEC_CAP_FRAME_THREADS |
+    .p.capabilities = AV_CODEC_CAP_DR1 |
                       AV_CODEC_CAP_ENCODER_REORDERED_OPAQUE,
     .init           = raw_encode_init,

>>>>
latency(total:7.312ms, decode 2.347ms/32%, decode-filter: 0.901ms/12%,
filter 1.293ms/17%, encode 2.71ms/37%)
latency(total:7.982ms, decode 3.369ms/42%, decode-filter: 0.617ms/7%,
filter 2.406ms/30%, encode 1.461ms/18%)
latency(total:4.63ms, decode 2.401ms/51%, decode-filter: 0.733ms/15%,
filter 0.912ms/19%, encode 0.524ms/11%)
latency(total:4.291ms, decode 2.154ms/50%, decode-filter: 0.568ms/13%,
filter 1.041ms/24%, encode 0.475ms/11%)
latency(total:5.103ms, decode 2.37ms/46%, decode-filter: 0.75ms/14%,
filter 1.238ms/24%, encode 0.652ms/12%)
latency(total:5.658ms, demux-decode: 0.112ms/1%, decode 2.283ms/40%,
decode-filter: 1.28ms/22%, filter 1.25ms/22%, filter-encode:
0.077ms/1%, encode 0.634ms/11%)
<<<<

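I also re-ran the test without the patch, adding "-threads 1" to my
command, but the total latency is still higher than with the patch. The
encode time is low again, but more time is now spent in the filter stage: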
>>>>
latency(total:15.234ms, decode 2.591ms/17%, decode-filter: 0.807ms/5%,
filter 5.803ms/38%, encode 5.893ms/38%)
latency(total:36.815ms, decode 5.125ms/13%, decode-filter: 1.445ms/3%,
filter 27.749ms/75%, encode 2.193ms/5%)
latency(total:13.314ms, decode 2.449ms/18%, decode-filter: 0.647ms/4%,
filter 8.811ms/66%, encode 1.319ms/9%)
latency(total:12.199ms, decode 2.211ms/18%, decode-filter: 0.561ms/4%,
filter 7.701ms/63%, encode 1.64ms/13%)
latency(total:11.915ms, decode 2.35ms/19%, decode-filter: 0.59ms/4%,
filter 8.014ms/67%, encode 0.881ms/7%)
latency(total:21.685ms, decode 2.493ms/11%, decode-filter: 1.024ms/4%,
filter 16.372ms/75%, encode 1.62ms/7%)
latency(total:12.756ms, decode 2.366ms/18%, decode-filter: 0.63ms/4%,
filter 8.206ms/64%, encode 1.457ms/11%)
latency(total:10.902ms, decode 2.163ms/19%, decode-filter: 0.865ms/7%,
filter 6.629ms/60%, encode 1.169ms/10%)
latency(total:24.758ms, decode 3.356ms/13%, decode-filter: 0.962ms/3%,
filter 19.548ms/78%, encode 0.788ms/3%)
latency(total:10.619ms, decode 2.122ms/19%, decode-filter: 0.652ms/6%,
filter 7.097ms/66%, encode 0.685ms/6%)
<<<<
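
To compare the runs above more easily, the latency lines can be averaged
per stage with a small script. This is a minimal sketch that assumes the
log format shown in this mail (entries wrapped over several lines, hyphenated
stage names followed by a colon); the names parse_latency and mean_per_stage
are made up for the example.

```python
import re
from collections import defaultdict
from statistics import mean

ENTRY_RE = re.compile(r"latency\((.*?)\)", re.DOTALL)
TOTAL_RE = re.compile(r"total:\s*([\d.]+)ms")
# Matches e.g. "decode 3.355ms/2%" and "decode-filter: 8.426ms/6%".
STAGE_RE = re.compile(r"([a-z]+(?:-[a-z]+)?):?\s+([\d.]+)ms/\d+%")


def parse_latency(text):
    """Return one {stage: milliseconds} dict per latency(...) entry."""
    # Drop mail quote prefixes so wrapped, quoted entries parse as well.
    text = re.sub(r"^>[ \t]?", "", text, flags=re.MULTILINE)
    entries = []
    for body in ENTRY_RE.findall(text):
        entry = {}
        total = TOTAL_RE.search(body)
        if total:
            entry["total"] = float(total.group(1))
        for stage, ms in STAGE_RE.findall(body):
            entry[stage] = float(ms)
        entries.append(entry)
    return entries


def mean_per_stage(entries):
    """Average each stage over the entries in which it appears."""
    values = defaultdict(list)
    for entry in entries:
        for stage, ms in entry.items():
            values[stage].append(ms)
    return {stage: mean(ms_list) for stage, ms_list in values.items()}
```

Feeding each log excerpt to parse_latency and then to mean_per_stage gives
one average per stage, so the encode and filter columns of the different
runs can be put side by side.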

Is this overhead expected when frame threading is enabled?
Is there another command-line option to disable threading?

Thanks

>
> Thanks for your help,
> Clement

