[FFmpeg-devel] [PATCH] ac3enc: Add x86-optimized function to speed up log2_tab().
Justin Ruggles
justin.ruggles
Sun Feb 13 20:49:50 CET 2011

AC3DSPContext.ac3_max_msb_abs_int16() finds the maximum MSB over the
absolute values of all elements in an array of int16_t.
---
Updated patch based on comments from Mans, Loren, and Ronald.

Added range constraint to function documentation.

Loren's suggestion of using min/max when available is faster.

Using the min/max approach for the C version is about 15% faster on
Athlon64 but 30% slower on Atom.  The existing version is simpler so I
just left it as-is.
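
For readers following the C-version comparison, here is a minimal sketch of
two ways such a function could be written in plain C: OR-accumulating the
absolute values, and tracking min/max and taking the absolute value once at
the end.  This is not the code from the patch; the function names and the
msb_pos() helper are hypothetical, and inputs are assumed to stay within
[-32767, 32767] so that negation cannot overflow.

```c
#include <stdint.h>
#include <stdlib.h>

/* OR-accumulate variant (hypothetical name).  The result is not the true
 * maximum, but its highest set bit equals that of max(|src[i]|), which is
 * all a log2-style lookup needs. */
static int max_msb_abs_or(const int16_t *src, int len)
{
    int v = 0;
    for (int i = 0; i < len; i++)
        v |= abs(src[i]);
    return v;
}

/* min/max variant (hypothetical name).  Track the signed extremes and take
 * the larger magnitude once, after the loop. */
static int max_msb_abs_minmax(const int16_t *src, int len)
{
    int lo = 0, hi = 0;
    for (int i = 0; i < len; i++) {
        if (src[i] < lo) lo = src[i];
        if (src[i] > hi) hi = src[i];
    }
    return (-lo > hi) ? -lo : hi;
}

/* Position of the highest set bit (0 for v <= 1); a stand-in for whatever
 * log2 helper the caller uses. */
static int msb_pos(int v)
{
    int n = 0;
    while (v > 1) {
        v >>= 1;
        n++;
    }
    return n;
}
```

The two variants can return different integers for the same input, but
msb_pos() of either result is the same, so a caller that only needs the MSB
can use whichever is faster on the target CPU.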

Ronald's suggestion of using shuffles+por instead of doing the final
calculations from the stack is faster in some situations and about the
same in others.  But it's simpler overall and avoids messing around
with the stack so I used it.
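
To illustrate the shuffles+por idea, here is a rough sketch using SSE2
intrinsics in C rather than the assembly in the patch.  The vector loop
leaves eight partial results in one register; the final step folds them into
a single value with shuffles and ORs, without storing to the stack.  All
names are hypothetical, the abs-via-max trick is my own substitution, and
inputs are assumed to be within [-32767, 32767].  A scalar fallback is
included so the sketch also builds on targets without SSE2.

```c
#include <stdint.h>
#include <stdlib.h>

#if defined(__SSE2__)
#include <emmintrin.h>

/* Hypothetical helper: OR all eight 16-bit lanes of acc into one value,
 * entirely in registers. */
static int hor_or_epi16(__m128i acc)
{
    /* Swap the 64-bit halves and OR: low half now covers all 8 lanes. */
    acc = _mm_or_si128(acc, _mm_shuffle_epi32(acc, _MM_SHUFFLE(1, 0, 3, 2)));
    /* Bring words 2,3 down onto words 0,1 and OR. */
    acc = _mm_or_si128(acc, _mm_shufflelo_epi16(acc, _MM_SHUFFLE(1, 0, 3, 2)));
    /* Swap words 0 and 1 and OR: word 0 now holds the full result. */
    acc = _mm_or_si128(acc, _mm_shufflelo_epi16(acc, _MM_SHUFFLE(2, 3, 0, 1)));
    return _mm_cvtsi128_si32(acc) & 0xFFFF;
}

static int max_msb_abs_int16_simd(const int16_t *src, int len)
{
    __m128i acc = _mm_setzero_si128();
    int i = 0, v;

    for (; i + 8 <= len; i += 8) {
        __m128i x   = _mm_loadu_si128((const __m128i *)(src + i));
        __m128i neg = _mm_sub_epi16(_mm_setzero_si128(), x);
        /* |x| computed as max(x, -x); SSE2 has no packed abs instruction. */
        acc = _mm_or_si128(acc, _mm_max_epi16(x, neg));
    }

    v = hor_or_epi16(acc);

    /* Scalar tail for lengths that are not a multiple of 8. */
    for (; i < len; i++)
        v |= abs(src[i]);
    return v;
}

#else

/* Scalar fallback with the same result, for targets without SSE2. */
static int max_msb_abs_int16_simd(const int16_t *src, int len)
{
    int v = 0;
    for (int i = 0; i < len; i++)
        v |= abs(src[i]);
    return v;
}

#endif
```

Each shuffle+OR step halves the number of lanes that still matter, so three
steps are enough for eight 16-bit lanes.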

Athlon64 X2 6000+:
   C: 20718
 MMX:  3590
MMX2:  2906
SSE2:  2062

Atom 330:
    C: 31838
  MMX:  7394
 SSE2:  3138
SSSE3:  2759

 libavcodec/ac3dsp.c         |    9 +++++
 libavcodec/ac3dsp.h         |   11 +++++++
 libavcodec/ac3enc_fixed.c   |   11 ++-----
 libavcodec/x86/ac3dsp.asm   |   69 +++++++++++++++++++++++++++++++++++++++++++
 libavcodec/x86/ac3dsp_mmx.c |   11 +++++++
 5 files changed, 103 insertions(+), 8 deletions(-)
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-ac3enc-Add-x86-optimized-function-to-speed-up-log2_t.patch
Type: text/x-patch
Size: 6326 bytes
Desc: not available
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20110213/bda0851e/attachment.bin>
    
    