[FFmpeg-devel] adding RGBA and BGRA to nvenc.c
Andy Furniss
adf.lists at gmail.com
Mon Sep 12 16:51:17 EEST 2016
Andy Furniss wrote:
> I do know that I have really grabbed and encoded 1080p60 with my AMD
> h/w and including nv12 conversion gives a sane looking result -
>
> gst-launch-1.0 -f ximagesrc use-damage=0 startx=0 starty=0 endx=1919
> endy=1079 num-buffers=1000 ! queue ! videoconvert !
> video/x-raw,framerate=100/1,format=NV12 ! fakesink
> Setting pipeline to PAUSED ...
> Pipeline is live and does not need PREROLL ...
> Setting pipeline to PLAYING ...
> New clock: GstSystemClock
> Got EOS from element "pipeline0".
> Execution ended after 0:00:14.419928745
> Setting pipeline to PAUSED ...
> Setting pipeline to READY ...
> Setting pipeline to NULL ...
> Freeing pipeline ...
>
> 1000/14.419928745 = 69.3
Over the weekend I looked at the CSC (colorspace conversion) aspect of
this without x11grab, i.e. benchmarking bgr0 read from tmpfs to nv12,
and with a bit of luck managed to get ffmpeg to beat gstreamer.
Starting point: gstreamer bgr0 to nv12 = 70 fps, to I420 = 68 fps.
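For reference, the gstreamer figures are just frames divided by wall
time, as in the quoted log. A minimal sketch of that arithmetic, using
the num-buffers value and the "Execution ended after" stamp from the
quoted run; the function names are my own and purely illustrative:

```python
def gst_elapsed_seconds(stamp: str) -> float:
    """Convert a gst-launch 'H:MM:SS.nnnnnnnnn' stamp to seconds."""
    hours, minutes, seconds = stamp.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


def frames_per_second(frames: int, stamp: str) -> float:
    """Average rate over the whole run: frames / elapsed wall time."""
    return frames / gst_elapsed_seconds(stamp)


# Values taken from the quoted gst-launch run (num-buffers=1000).
rate = frames_per_second(1000, "0:00:14.419928745")
print(f"{rate:.1f} fps")
```

This is an average over the whole run, so it includes pipeline startup
and says nothing about per-frame jitter.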
I benched ffmpeg with -f null, because -f rawvideo to ram or /dev/null
is slower. I suspect/hope that -f null will be more representative of my
intended usage (vaapi upload), but of course I don't know that.
ffmpeg -f rawvideo -s 1920x1080 -pix_fmt bgr0 -i /mnt/ramdisk/out.bgr0
-pix_fmt nv12 -f null -
= 41 fps for nv12, yuv420p = 66 fps
So yuv420p is close to gstreamer but nv12 is poor.
By chance I wondered how much worse it would be if I used -sws_flags, as
I have done in the past. The result was that it got faster: it turns out
that +full_chroma_inp takes yuv420p from 66 to 84 fps and nv12 from 41
to 47 fps.
The reason is that with no flags time is spent in bgr32toUV_half_c,
whereas with the flag above that is not used and I see various sse
functions in use, like ff_rgbatoUV_sse2.
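To put rough numbers on what the flag buys, here is a small sketch that
uses only the fps figures quoted above. The dictionary and function
names are mine, and the figures are single runs on my box, so treat the
percentages as indicative only:

```python
# fps figures from the runs above: bgr0 input, 1920x1080, -f null output.
baseline = {"yuv420p": 66, "nv12": 41}   # no sws flags
with_flag = {"yuv420p": 84, "nv12": 47}  # +full_chroma_inp


def gain_percent(before: float, after: float) -> float:
    """Relative throughput change in percent."""
    return (after / before - 1.0) * 100.0


gains = {fmt: gain_percent(baseline[fmt], with_flag[fmt]) for fmt in baseline}

for fmt, gain in gains.items():
    print(f"{fmt}: {baseline[fmt]} -> {with_flag[fmt]} fps (+{gain:.0f}%)")
```

So the flag helps both outputs (roughly 27% for yuv420p and 15% for
nv12), but nv12 is still well short of the gstreamer 70 fps.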
nv12 is still too slow though. Looking with sysprof I see that time
is spent in yuv2nv12cX_c.
That seemed slow compared with what I remembered of yuv420p -> nv12
conversions in the past, so I benched 1080p yuv420p -> nv12 and got over
500 fps. That didn't use yuv2nv12cX_c at all, so I made a new command
line that goes via yuv420p -
ffmpeg -f rawvideo -s 1920x1080 -pix_fmt bgr0 -i /mnt/ramdisk/out.bgr0
-vf scale=flags=+full_chroma_inp,format=yuv420p,format=nv12 -f null -
= 78fps, nice.
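If anyone wants to script the comparison, a minimal sketch of building
the command line above as an argument list follows. The function name
and its parameters are hypothetical, my own choice; the ffmpeg options
are only the ones used in this mail. It just prints the command and does
not run ffmpeg:

```python
import shlex


def build_csc_bench_cmd(input_path, size="1920x1080", src_fmt="bgr0",
                        filters=("scale=flags=+full_chroma_inp",
                                 "format=yuv420p",
                                 "format=nv12")):
    """Return the ffmpeg argv for a raw-video CSC benchmark to -f null."""
    return [
        "ffmpeg",
        "-f", "rawvideo", "-s", size, "-pix_fmt", src_fmt,
        "-i", input_path,
        "-vf", ",".join(filters),  # filters run left to right
        "-f", "null", "-",         # discard output, measure conversion only
    ]


cmd = build_csc_bench_cmd("/mnt/ramdisk/out.bgr0")
print(shlex.join(cmd))
```

Passing the list to subprocess.run() would execute it; dropping the
format=yuv420p entry from filters gives the direct bgr0 -> nv12 path
for comparison.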
So at least I can beat gstreamer on CSC now. Testing the new command line
with x11grab gets me close to gst when using the legacy x11grab = 65 fps.
libxcb x11grab is only 52 fps though, so it would be good if that could
be fixed up.