[FFmpeg-devel] [PATCH] ac3enc: Add x86-optimized function to speed up log2_tab().
Justin Ruggles
justin.ruggles
Sun Feb 13 20:49:50 CET 2011
AC3DSPContext.ac3_max_msb_abs_int16() finds the maximum MSB of the absolute
value of each element in an array of int16_t.
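
For readers without the attachment, here is a minimal C sketch of one way such a function could work. It is not taken from the patch: the names max_msb_abs_int16_or and msb_index are hypothetical. It relies on the fact that OR-ing the absolute values gives a result whose highest set bit matches that of the largest absolute value.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical name, not the function from the patch.
 * OR together the absolute values of all elements. The result is not the
 * maximum itself, but its highest set bit equals that of max(abs(src[i])). */
static int max_msb_abs_int16_or(const int16_t *src, int len)
{
    int v = 0;
    for (int i = 0; i < len; i++)
        v |= abs(src[i]);   /* src[i] is promoted to int before abs() */
    return v;
}

/* Hypothetical helper: index of the highest set bit (floor(log2(v)));
 * returns 0 when v is 0. */
static int msb_index(unsigned v)
{
    int n = 0;
    while (v >>= 1)
        n++;
    return n;
}
```

A caller that only needs the MSB position can pass the OR result straight to msb_index(), since the lower bits do not affect it.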
---
Updated patch based on comments from Mans, Loren, and Ronald.

Added range constraint to function documentation.

Loren's suggestion of using min/max when available is faster.
Using the min/max approach for the C version is about 15% faster on
Athlon64 but 30% slower on Atom. The existing version is simpler, so I
just left it as-is.
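
As a rough illustration of the min/max idea in plain C (a sketch under my own assumptions, not the code from the patch; the name max_msb_abs_int16_minmax is hypothetical): track the running minimum and maximum, then take the larger magnitude at the end. The result is the true maximum absolute value, so its MSB matches what an OR over the absolute values would give.

```c
#include <stdint.h>

/* Hypothetical name, not the function from the patch.
 * Track the smallest and largest element, then return the larger magnitude. */
static int max_msb_abs_int16_minmax(const int16_t *src, int len)
{
    int lo = 0, hi = 0;          /* starting at 0 keeps lo <= 0 <= hi */
    for (int i = 0; i < len; i++) {
        if (src[i] < lo) lo = src[i];
        if (src[i] > hi) hi = src[i];
    }
    int neg = -lo;               /* safe in int arithmetic, even for -32768 */
    return neg > hi ? neg : hi;
}
```

The loop body has two compares and no abs(), which is the kind of trade-off that could go either way depending on the CPU, consistent with the mixed C timings above.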

Ronald's suggestion of using shuffles+por instead of doing the final
calculations from the stack is faster in some situations and about the
same in others. It's simpler overall and avoids messing around with
the stack, so I used it.
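
To show what a shuffle+por reduction does, here is a portable scalar model in C. It is only an illustration, not the asm from the patch: the name horizontal_or_model and the choice of 8 lanes (one 128-bit register of 16-bit words) are assumptions. Each pass of the outer loop stands in for one shuffle, which moves the upper half of the remaining lanes down, followed by one por.

```c
#include <stdint.h>

#define LANES 8   /* assumed: 8 x 16-bit lanes, as in one 128-bit register */

/* Hypothetical scalar model of a horizontal OR across SIMD lanes.
 * Folds 8 lanes into 4, then 2, then 1, OR-ing the upper half into the
 * lower half on each pass. */
static uint16_t horizontal_or_model(const uint16_t acc[LANES])
{
    uint16_t v[LANES];
    for (int i = 0; i < LANES; i++)
        v[i] = acc[i];

    for (int half = LANES / 2; half >= 1; half /= 2)
        for (int i = 0; i < half; i++)
            v[i] |= v[i + half];   /* models "shuffle upper half down, then por" */

    return v[0];                   /* lane 0 now holds the OR of all lanes */
}
```

Three passes reduce eight lanes to one entirely in registers, which is why this form does not need to spill the accumulator to the stack.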
Athlon64 X2 6000+:
C: 20718
MMX: 3590
MMX2: 2906
SSE2: 2062
Atom 330:
C: 31838
MMX: 7394
SSE2: 3138
SSSE3: 2759
libavcodec/ac3dsp.c | 9 +++++
libavcodec/ac3dsp.h | 11 +++++++
libavcodec/ac3enc_fixed.c | 11 ++-----
libavcodec/x86/ac3dsp.asm | 69 +++++++++++++++++++++++++++++++++++++++++++
libavcodec/x86/ac3dsp_mmx.c | 11 +++++++
5 files changed, 103 insertions(+), 8 deletions(-)
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-ac3enc-Add-x86-optimized-function-to-speed-up-log2_t.patch
Type: text/x-patch
Size: 6326 bytes
Desc: not available
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20110213/bda0851e/attachment.bin>