[FFmpeg-devel] [PATCH] x86: hpeldsp: implement SSSE3 version of _xy2

Michael Niedermayer michaelni at gmx.at
Sat May 24 17:00:26 CEST 2014


On Fri, May 23, 2014 at 01:03:06AM +0200, Christophe Gisquet wrote:
> Patch includes benchmarks. A new yasm macro was created to keep the
> code from getting messier.
> 
> -- 
> Christophe

>  hpeldsp.asm    |   70 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  hpeldsp_init.c |   22 +++++++++++++++++
>  2 files changed, 92 insertions(+)
> 64c7889ce510102b4204ed58cccb29e191581ae2  0003-x86-hpeldsp-implement-SSSE3-version-of-_xy2.patch
> From ddde8bbd9891b79dbccde1c70d739ec38ab301ea Mon Sep 17 00:00:00 2001
> From: Christophe Gisquet <christophe.gisquet at gmail.com>
> Date: Thu, 22 May 2014 23:47:06 +0200
> Subject: [PATCH 3/3] x86: hpeldsp: implement SSSE3 version of _xy2
> 
> Loading pb_1 rather than pw_8192 was benchmarked to be more efficient.
> Loading both yields no advantage. Loading only one saves ~11 cycles.
> 
> decicycles count:
> put8:  3223(mmx)    -> 2387
> avg8:  2863(mmxext) -> 2125
> put16: 4356(sse2)   -> 3553
> avg16: 4481(sse2)   -> 3513
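
For readers unfamiliar with the two constants named in the commit message,
here is a scalar sketch of the arithmetic they suggest. It assumes (the
patch body is not quoted here, so this is an assumption) that pb_1 feeds
pmaddubsw to add adjacent bytes and that pw_8192 feeds pmulhrsw to perform
the rounded shift by 2. All function names below are hypothetical and are
not taken from hpeldsp.asm.

```c
#include <assert.h>
#include <stdint.h>

/* One 16-bit lane of pmulhrsw: (a * b + 0x4000) >> 15. */
static int16_t mulhrs_lane(int16_t a, int16_t b)
{
    return (int16_t)(((int32_t)a * b + 0x4000) >> 15);
}

/* One output lane of pmaddubsw against bytes of 1 (pb_1):
 * the sum of two adjacent unsigned bytes, widened to 16 bits. */
static int16_t maddubs_pb1_lane(uint8_t x0, uint8_t x1)
{
    return (int16_t)(x0 * 1 + x1 * 1);
}

/* Reference _xy2 rounding: average of a 2x2 pixel neighbourhood. */
static uint8_t xy2_ref(uint8_t a, uint8_t b, uint8_t c, uint8_t d)
{
    return (uint8_t)((a + b + c + d + 2) >> 2);
}

/* Assumed SSSE3-style path: two horizontal pair sums, one vertical add,
 * then a multiply by 8192, since (s * 2^13 + 2^14) >> 15 == (s + 2) >> 2. */
static uint8_t xy2_model(uint8_t a, uint8_t b, uint8_t c, uint8_t d)
{
    int16_t sum = (int16_t)(maddubs_pb1_lane(a, b) + maddubs_pb1_lane(c, d));
    assert(sum >= 0 && sum <= 1020); /* four bytes never overflow a lane */
    return (uint8_t)mulhrs_lane(sum, 8192);
}

/* Compare both paths over all values of a and c, with edge values for
 * b and d; returns the number of disagreements. */
static int xy2_count_mismatches(void)
{
    static const uint8_t edge[] = { 0, 1, 127, 255 };
    int bad = 0;
    for (int a = 0; a < 256; a++)
        for (int c = 0; c < 256; c++)
            for (int i = 0; i < 4; i++)
                for (int j = 0; j < 4; j++)
                    bad += xy2_model((uint8_t)a, edge[i], (uint8_t)c, edge[j])
                        != xy2_ref((uint8_t)a, edge[i], (uint8_t)c, edge[j]);
    return bad;
}
```

Calling xy2_count_mismatches() from a test harness checks that the
multiply-by-8192 form agrees with the plain (a + b + c + d + 2) >> 2
rounding for the sampled inputs.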

applied

thanks

[...]

-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Complexity theory is the science of finding the exact solution to an
approximation. Benchmarking OTOH is finding an approximation of the exact