[FFmpeg-user] Decoding performance, 6.1.1 vs. 4.4.4

René J. V. Bertin rjvbertin at gmail.com
Wed Jan 10 14:18:40 EET 2024


Hi,

I've just built and installed FFmpeg 6.1.1 alongside my trusty 4.4.4 
installation, using the exact same compiler options and clang (9 and 8, 
respectively; v6.1.1 does get debug info via `-g`). This is about performance on 
2 older machines: a Mac with a 2nd gen i7 (Sandy Bridge) and a Linux notebook 
with an N3150 CPU, both having only the integrated GPU for graphics and thus 
limited to H264 decoding in terms of hardware acceleration.

To evaluate performance I used two 1080p videos: a VP9 one downloaded from 
YouTube (30Hz) and the "Big Buck Bunny" video from the Blender Foundation 
(bbb_sunflower_1080p_60fps_normal.mp4; H264, 60Hz).

I used `-threads 4 -i <video> -benchmark -f null -`.
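To make the comparison repeatable, the invocation above could be generated for both builds by a small helper. This is only a sketch: `build_cmd` is a hypothetical name, and it assumes the two binaries are installed as `ffmpeg` and `ffmpeg6`, as in the transcripts in this post.

```python
import shlex


def build_cmd(binary, video, hwaccel=None, threads=4):
    """Return the benchmark command line as an argument list.

    hwaccel=None gives the CPU-decoding variant; hwaccel="vaapi" adds
    `-hwaccel vaapi` before the input, matching the commands in this post.
    """
    cmd = [binary, "-threads", str(threads)]
    if hwaccel:
        cmd += ["-hwaccel", hwaccel]
    cmd += ["-i", video, "-benchmark", "-f", "null", "-"]
    return cmd


if __name__ == "__main__":
    video = "bbb_sunflower_1080p_60fps_normal.mp4"
    # Binary names are assumptions taken from the transcripts in this post.
    for binary in ("ffmpeg", "ffmpeg6"):
        for hwaccel in (None, "vaapi"):
            print(shlex.join(build_cmd(binary, video, hwaccel)))
```

The resulting list can be passed to `subprocess.run` unchanged, which avoids any shell quoting issues with the file name.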

On the Mac, performance compares as you'd prefer to see it: v6.1.1 is 
consistently a tiny bit faster for CPU decoding than v4.4.4 while hw-accelerated 
decoding is just as fast (or slow if you want).

On Linux the situation is the opposite, sadly: a consistent small loss of 
performance for v6.1.1 in CPU decoding, but the big surprise was a somewhat 
bigger loss in VAAPI-accelerated decoding (123 fps vs. 91 fps, see below).

FFmpeg 4.4.4:
```
> time ffmpeg -threads 4 -hwaccel vaapi -i bbb_sunflower_1080p_60fps_normal.mp4 -benchmark -f null -
frame=38072 fps=123 q=-0.0 Lsize=N/A time=00:10:34.56 bitrate=N/A speed=2.05x    
video:19928kB audio:356706kB subtitle:0kB other streams:0kB global headers:0kB 
muxing overhead: unknown
bench: utime=198.046s stime=44.752s rtime=309.083s
bench: maxrss=168904kB
198.431 user_cpu 44.791 kernel_cpu 5:09.48 total_time 78.5%CPU {168904M 13F 
2504898R 695857I 0O 462253w 226657c}
```

FFmpeg 6.1.1:
```
> time ffmpeg6 -threads 4 -hwaccel vaapi -i bbb_sunflower_1080p_60fps_normal.mp4 -benchmark -f null -
frame=38072 fps= 91 q=-0.0 Lsize=N/A time=00:10:34.55 bitrate=N/A speed=1.51x    
bench: utime=245.679s stime=119.927s rtime=419.141s
bench: maxrss=137632kB
246.007 user_cpu 119.987 kernel_cpu 6:59.49 total_time 87.2%CPU {137632M 0F 
16342471R 695447I 0O 847360w 243741c}

```

(sorry for the linebreaks, I'm posting via a newsgroup app.)

To me this looks like there's been a regression that causes an increased 
overhead in getting the content onto and/or off the GPU.
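To put numbers on that impression, the two `bench:` lines from the VAAPI runs can be compared directly. The sketch below only parses the values already quoted in this post; `parse_bench` is a hypothetical helper name, not part of FFmpeg.

```python
import re

# Matches the timing line printed by ffmpeg's -benchmark option,
# in the form seen in the transcripts in this post.
BENCH_RE = re.compile(r"utime=([\d.]+)s stime=([\d.]+)s rtime=([\d.]+)s")


def parse_bench(line):
    """Extract user, system and real time (in seconds) from a bench line."""
    match = BENCH_RE.search(line)
    if match is None:
        raise ValueError(f"not a bench timing line: {line!r}")
    utime, stime, rtime = (float(value) for value in match.groups())
    return {"utime": utime, "stime": stime, "rtime": rtime}


# Values copied verbatim from the two VAAPI runs in this post.
old = parse_bench("bench: utime=198.046s stime=44.752s rtime=309.083s")  # 4.4.4
new = parse_bench("bench: utime=245.679s stime=119.927s rtime=419.141s")  # 6.1.1

ratios = {key: new[key] / old[key] for key in old}

if __name__ == "__main__":
    for key, ratio in ratios.items():
        print(f"{key}: {old[key]:.3f}s -> {new[key]:.3f}s ({ratio:.2f}x)")
```

By this measure, wall-clock time grows by roughly 1.36x and user time by about 1.24x, while kernel (system) time grows by about 2.7x. The disproportionate increase in kernel time is what would be expected if the extra cost is in driver calls or buffer transfers rather than in the decoder itself, although these figures alone cannot confirm that.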

Is that possible? Could it have to do with the fact that FFmpeg6 auto-enables 
support for hw-acceleration via Vulkan (a moot selling point on this hardware) 
and if so would it help to disable that support?

Thanks,
R.


