[Ffmpeg-devel] [RFC] smallcpy for h264
Michael Niedermayer
michaelni
Sat Oct 7 14:56:05 CEST 2006
Hi
On Sat, Oct 07, 2006 at 02:18:52PM +0200, Luca Barbato wrote:
> Michael Niedermayer wrote:
> > but before i will agree to this i want
> > 1. to know why we spend a significant time doing small memcpys
>
> Loren do you have time to have a look on it? The on x86simd codepath has
> many of them...
>
> > 2. why ppc doesnt inline memcpy like x86 does
>
> inlined memcpy are triggered with -O3 iirc, so having them doesn't help
> speed at all (see the threads about avoiding -O3 to get better speed)
the -O2 vs. -O3 gains were about dsputil*, and these are already arch specific,
so no small_cpy is needed there
furthermore, if -O2 is faster, then either it's not changing memcpy inlining
behavior or memcpy is so insignificant that it doesn't matter
> I'll dig glibc to see if we have inlined variants available.
this belongs to gcc, not libc; libc cannot detect a fixed-size memcpy
and inline a special optimized version
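
A minimal sketch of the distinction being made here, not code from the patch
under discussion. The function names copy_fixed8 and copy_runtime are
hypothetical. With a constant size the compiler sees the length at the call
site and can usually expand memcpy into a few loads and stores; with a size
known only at run time it generally has to emit a real call. Whether the
expansion actually happens depends on the gcc version, the target and the
flags (for example -fno-builtin disables it), so the generated assembly from
"gcc -O2 -S" is the place to check.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: the size (8) is a compile-time constant, so the
 * compiler is free to replace the memcpy call with inline loads/stores.
 * This is a compiler decision; libc never sees the constant. */
void copy_fixed8(uint8_t *dst, const uint8_t *src)
{
    memcpy(dst, src, 8);
}

/* Hypothetical helper: the size is only known at run time, so the
 * compiler generally cannot specialize the copy and calls memcpy. */
void copy_runtime(uint8_t *dst, const uint8_t *src, size_t n)
{
    assert(dst != NULL && src != NULL);
    memcpy(dst, src, n);
}
```

Both helpers copy the same bytes; only the code the compiler is able to
generate for them differs.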
>
> >
> > furthermore these alignment related changes must be split, reviewed
> > and applied before any benchmarking makes sense (= your benchmark
> > of misaligned arrays with memcpy vs. your code with aligned arrays
> > might show more the speed difference of alignment and less that
> > of the actual code)
>
> please check the attached code.
rejected, the only memcpys i have found which touch these are in the init code
[...]
--
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
In the past you could go to a library and read, borrow or copy any book
Today you'd get arrested for mere telling someone where the library is