[FFmpeg-devel] [PATCH] Add x86-optimized function ac3_or_abs_int16() and use in log2_tab().
Måns Rullgård
Sat Feb 12 13:48:23 CET 2011

Loren Merritt <lorenm at u.washington.edu> writes:
>>+%macro PABSW2_MMX 6 ; dst1, dst2, src1, src2, temp1, temp2
>>+    mova    %1, %3
>>+    mova    %2, %4
>>+    mova    %5, %1
>>+    mova    %6, %2
>>+    psraw   %5, 15
>>+    psraw   %6, 15
>>+    pxor    %1, %5
>>+    pxor    %2, %6
>>+    psubw   %1, %5
>>+    psubw   %2, %6
>>+%endmacro
>>+
>>+%macro PABSW2_SSSE3 6 ; dst1, dst2, src1, src2, unused, unused
>>+    pabsw   %1, %3
>>+    pabsw   %2, %4
>>+%endmacro
>
> Already in x86util.asm
>
> But you don't actually want to compute (bit-or of abs), right? You
> want to compute (log2 of max of abs). Since MMX has min/max
> instructions and doesn't have abs, try running signed min/max first
> and doing abs only once in the tail.
> That way might be faster in C too, on cpus with scalar cmov/min/max
> and without scalar abs.

So the description could be made more general, allowing both approaches.
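
To illustrate why both approaches can sit behind one description, here is a
minimal C sketch. It is not the code from the patch: the names
or_abs_int16_ref(), max_abs_int16_ref() and ilog2_ref() are made up for this
example, and the real function may handle lengths and alignment differently.
The OR of the absolute values and the maximum absolute value are generally
different numbers, but they share the same highest set bit, so the log2
comes out the same either way.

```c
#include <stdint.h>
#include <stdlib.h>

/* Approach 1 (hypothetical reference): OR together all absolute values. */
static unsigned or_abs_int16_ref(const int16_t *src, int len)
{
    unsigned v = 0;
    for (int i = 0; i < len; i++)
        v |= (unsigned)abs(src[i]);   /* src[i] is promoted to int, so -32768 is safe */
    return v;
}

/* Approach 2 (hypothetical reference): keep a signed min and max,
 * and take the absolute value only once at the end. */
static unsigned max_abs_int16_ref(const int16_t *src, int len)
{
    int lo = 0, hi = 0;               /* starting at 0 keeps lo <= 0 <= hi */
    for (int i = 0; i < len; i++) {
        if (src[i] < lo) lo = src[i];
        if (src[i] > hi) hi = src[i];
    }
    return (unsigned)(-lo > hi ? -lo : hi);
}

/* floor(log2(v)) for v > 0; returns 0 for v == 0. */
static int ilog2_ref(unsigned v)
{
    int n = 0;
    while (v >>= 1)
        n++;
    return n;
}
```

For any input, ilog2_ref(or_abs_int16_ref(...)) and
ilog2_ref(max_abs_int16_ref(...)) should agree, which is all the caller
cares about if it only wants the log2.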
-- 
Måns Rullgård
mans at mansr.com
    
    