[Ffmpeg-devel] [PATCH] fix mpeg4 lowres chroma bug and increase h264/mpeg4 MC speed
Michael Niedermayer
michaelni
Thu Feb 8 18:52:14 CET 2007
Hi
On Thu, Feb 08, 2007 at 02:48:53AM -0800, Trent Piepho wrote:
[...]
> Anyway, there is an obvious way to make it faster that we both missed the
> first time:
>
> #define H264_CHROMA_OP2(S,D,T) "punpcklwd 2+" #S ", " #D "\n\t"
>
> This is about 4.38% faster than my first patch, and 17.4% faster than
> the original code.
but slower than what is in svn, which is what matters (it slows h.264 down)
>
> > >> I benchmarked my version, by measuring put_h264_chroma_mc2_mmx2() from
> > >> start to finish with rdtsc, as 12.8% faster than before.
> >
> > The speed increased because memory-to-cache transfers are faster when
> > reading than when writing. So you are effectively doing a fast cache
> > preload, which speeds up the code. Is the patch still faster if you test
> > it during an actual decode rather than in isolation?
>
> The speed increase is mainly from changing "x*y" into "xtimesy[x][y]",
> which changes an "imull %ebx, %edx" into "movzbl xtimesy(%ebx,%edx,8),
> %ecx". The latter really is much faster (really, it is! benchmark it (I
> did)).
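For readers following along, here is a minimal sketch of the kind of lookup table described above. It is not taken from the patch: the table contents, the 0..7 index range, and the helper names init_xtimesy(), mul_lookup() and check_xtimesy() are assumptions for illustration. The only hints from the quoted text are the name xtimesy, the byte load (movzbl), and the scale factor of 8 in the addressing mode, which suggest a byte table with rows of 8 entries.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: 8x8 byte table holding x*y for x, y in 0..7.
 * The largest entry is 7*7 = 49, so each product fits in one byte. */
static uint8_t xtimesy[8][8];

/* Hypothetical helper: fill the table once before use. */
static void init_xtimesy(void)
{
    for (int x = 0; x < 8; x++)
        for (int y = 0; y < 8; y++)
            xtimesy[x][y] = (uint8_t)(x * y);
}

/* Hypothetical helper: replaces a runtime multiply with one byte load.
 * Only valid for 0 <= x, y <= 7. */
static inline int mul_lookup(int x, int y)
{
    return xtimesy[x][y];
}

/* Hypothetical helper: confirm the table agrees with plain
 * multiplication over the whole index range. */
static void check_xtimesy(void)
{
    for (int x = 0; x < 8; x++)
        for (int y = 0; y < 8; y++)
            assert(mul_lookup(x, y) == x * y);
}
```

Call init_xtimesy() once at startup and then use mul_lookup(x, y) wherever x*y appeared. Whether the load actually beats the multiply depends on the CPU and on the table staying in cache, so it should be measured in a full decode and not only in isolation.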
>
> I've attached a new version, with the faster asm code. I also got rid of
> the bogus gas warning about a missing operand, from the "2+%0" offsetting a
> memory reference construct. In this case it was a minor tweak to the asm
> code to avoid the warning and still generate the exact same code.
could you send separate patches for each separate change?
[...]
--
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
it is not once nor twice but times without number that the same ideas make
their appearance in the world. -- Aristotle