[FFmpeg-devel] Memcpy Operation Duration
Ronald S. Bultje
rsbultje at gmail.com
Tue Oct 18 23:08:14 EEST 2016
Hi Ali,
On Tue, Oct 18, 2016 at 3:57 PM, Ali KIZIL <alikizil at gmail.com> wrote:
> 2016-10-18 22:44 GMT+03:00 Sven C. Dack <sven.c.dack at sky.com>:
>
> > On 18/10/16 20:26, Ali KIZIL wrote:
> >
> >> Hi Everyone,
> >>
> >> Today, I was analyzing memcpy duration in FFmpeg. I noticed that it
> >> takes longer than an optimized SSE, SSE2, MMX, MMX2, AVX or AVX2 based
> >> memcpy operation.
> >>
> >> I tried an FFmpeg build compiled with march=corei7-avx2; it does not
> >> change the duration of the memcpy operation.
> >> I also followed
> >> https://trac.ffmpeg.org/wiki/CompilationGuide#PerformanceTips
> >> Same result. In addition, I tried gcc 6.2 in case gcc is not selecting
> >> the correct flag. Same result again.
> >>
> >> These memcpy operations affect the decoding (and probably encoding)
> >> fps rates.
> >>
> >> In one case, an unscaled uyvy422 to p010 3840x2160 conversion in
> >> rawvideo, the rate increased from 44 fps to 52 fps on a Xeon E5 2630 v4.
> >>
> >> Am I missing something when compiling FFmpeg for AVX2 or other
> >> optimisation flags, or does FFmpeg need a fix to direct some (or all)
> >> memcpy operations to a memcpy implementation that can decide which flag
> >> to optimise for?
> >> Or is there no such need and I am on the wrong path?
> >>
> >> (As a side note, FFmpeg works performance on i7 Extreme cores compared
> >> to Xeon v4 processors.)
> >
> > Could be it's gcc's built-in version. It's been said that libc is
> > occasionally better at it than gcc's built-in version.
> >
> > Use -fno-builtin-memcpy and see what difference it makes.
>
>
> I see, tomorrow morning I will give it a try.
> Thank you for the good idea. If it increases performance, maybe it will be a
> good idea to make a configure option.
configure has --extra-cflags=.. and --extra-ldflags=.. options to add
custom CC CLI arguments.
Ronald