[FFmpeg-devel] [PATCH] x86: hpeldsp: implement SSSE3 version of _xy2
Michael Niedermayer
michaelni at gmx.at
Sat May 24 17:00:26 CEST 2014
On Fri, May 23, 2014 at 01:03:06AM +0200, Christophe Gisquet wrote:
> Patch includes benchmarks. A new yasm macro was created to keep the
> code from getting messier.
>
> --
> Christophe
> hpeldsp.asm | 70 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> hpeldsp_init.c | 22 +++++++++++++++++
> 2 files changed, 92 insertions(+)
> 64c7889ce510102b4204ed58cccb29e191581ae2 0003-x86-hpeldsp-implement-SSSE3-version-of-_xy2.patch
> From ddde8bbd9891b79dbccde1c70d739ec38ab301ea Mon Sep 17 00:00:00 2001
> From: Christophe Gisquet <christophe.gisquet at gmail.com>
> Date: Thu, 22 May 2014 23:47:06 +0200
> Subject: [PATCH 3/3] x86: hpeldsp: implement SSSE3 version of _xy2
>
> Loading pb_1 rather than pw_8192 was benchmarked to be more efficient.
> Loading both constants yields no advantage; loading just one saves ~11 cycles.
>
> decicycles count:
> put8: 3223(mmx) -> 2387
> avg8: 2863(mmxext) -> 2125
> put16: 4356(sse2) -> 3553
> avg16: 4481(sse2) -> 3513
applied
thanks
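
For context: the _xy2 functions in hpeldsp interpolate at the half-pel
position in both x and y, i.e. each output pixel is the rounded average
(a+b+c+d+2)>>2 of a 2x2 neighbourhood. The patch body is not quoted here,
so the following is only a sketch under an assumption: the two constants
named in the commit message suggest that byte pairs are summed with
pmaddubsw against pb_1 and that the rounded shift is done with pmulhrsw
against pw_8192. The helper names below are hypothetical scalar models of
one SIMD lane, not FFmpeg functions.

```c
#include <stdint.h>

/* Reference: rounded average of a 2x2 neighbourhood. */
static uint8_t xy2_ref(uint8_t a, uint8_t b, uint8_t c, uint8_t d)
{
    return (uint8_t)((a + b + c + d + 2) >> 2);
}

/* Scalar model of one pmaddubsw lane with a pb_1 multiplier (assumed usage):
 * two adjacent unsigned bytes are each multiplied by 1 and summed into a
 * 16-bit word. The maximum is 510, so the saturation never triggers. */
static int16_t model_pmaddubsw_pb1(uint8_t lo, uint8_t hi)
{
    return (int16_t)(lo * 1 + hi * 1);
}

/* Scalar model of one pmulhrsw lane: (((x * y) >> 14) + 1) >> 1. */
static int16_t model_pmulhrsw(int16_t x, int16_t y)
{
    return (int16_t)(((((int32_t)x * y) >> 14) + 1) >> 1);
}

/* One output pixel: horizontal pair sums of two rows, added together,
 * then multiplied by pw_8192 (2^13), which acts as (sum + 2) >> 2. */
static uint8_t xy2_model(uint8_t a, uint8_t b, uint8_t c, uint8_t d)
{
    int16_t top = model_pmaddubsw_pb1(a, b);
    int16_t bot = model_pmaddubsw_pb1(c, d);
    return (uint8_t)model_pmulhrsw((int16_t)(top + bot), 8192);
}

/* Checks every possible 4-pixel sum (0..1020) against the reference
 * rounding; returns the number of mismatches. */
static int xy2_count_mismatches(void)
{
    int bad = 0;
    for (int s = 0; s <= 1020; s++)
        if (model_pmulhrsw((int16_t)s, 8192) != ((s + 2) >> 2))
            bad++;
    return bad;
}
```

Under this assumption both constants are needed inside the loop, which
would be why the commit message discusses which of the two is worth
keeping loaded in a register.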
[...]
--
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
Complexity theory is the science of finding the exact solution to an
approximation. Benchmarking OTOH is finding an approximation of the exact