[Ffmpeg-devel] PATCH Blackfin optimized byte swapping mechanism

Tue Apr 17 14:49:40 CEST 2007

Michael Niedermayer writes:
 > Hi
 > 
 > On Tue, Apr 17, 2007 at 07:40:47AM -0400, Marc Hoffman wrote:
 > > 
 > >  > Low level bswap primitive for the Blackfin Architecture.
 > > 
 > > sorry mangled patch wrong encoding last time.
 > 
 > what advantage do these functions have over the default?
 > are they faster? if so you should provide some benchmarks

Sorry about the top post, please forgive me.

The current 32-bit byte swap routine produces this code sequence:

        R1 = 255 (X);
        R1 <<= 16;
        R1 = R0 & R1;
        R2 = R0 >> 24;
        R1 >>= 8;
        R2 = R2 | R1;
        R1 = 65280 (Z);
        R1 = R0 & R1;
        R1 <<= 8;
        R0 <<= 24;
        R1 = R1 | R0;
        R2 = R2 | R1;

        R0 = R2; <<--- result 
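
For reference, the listing above looks like what a compiler would emit for a plain shift-and-mask swap in C. The following is only a sketch of that kind of routine, not the actual default from the tree; the name bswap32_generic is hypothetical, and the comments map each term to the registers in the listing.

```c
#include <stdint.h>

/* Hypothetical name: a generic shift-and-mask 32-bit byte swap. */
static uint32_t bswap32_generic(uint32_t x)
{
    return (x >> 24)                   /* R2 = R0 >> 24                   */
         | ((x & 0x00FF0000u) >> 8)    /* mask 255 << 16, then R1 >>= 8   */
         | ((x & 0x0000FF00u) << 8)    /* mask 65280, then R1 <<= 8       */
         | (x << 24);                  /* R0 <<= 24                       */
}
```

Each of the four bytes needs its own mask and shift, and the partial results are then OR-ed together, which is where the long instruction sequence comes from.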

The suggested replacement is:

    asm("%1 = %0 >> 8 (V);\n\t"
        "%0 = %0 << 8 (V);\n\t"
        "%0 = %0 | %1;\n\t"
        "%0 = PACK(%0.L, %0.H);\n\t"

That is 4 instructions instead of 12 (not counting the final move), so I guess this is about a 3x improvement in performance for this function.
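
To check that the four-instruction sequence really computes a byte swap, here is a portable C model of it. This is a sketch under assumptions: it takes (V) to mean a logical shift applied independently to each 16-bit half, and PACK(a, b) to place a in the high half and b in the low half. The name bswap32_model is hypothetical and this is not the patch itself.

```c
#include <stdint.h>

/* Hypothetical portable model of the four Blackfin instructions. */
static uint32_t bswap32_model(uint32_t x)
{
    uint32_t tmp;

    /* %1 = %0 >> 8 (V): shift each 16-bit half right by 8 */
    tmp = (x >> 8) & 0x00FF00FFu;

    /* %0 = %0 << 8 (V): shift each 16-bit half left by 8 */
    x = (x << 8) & 0xFF00FF00u;

    /* %0 = %0 | %1: bytes are now swapped within each half */
    x |= tmp;

    /* %0 = PACK(%0.L, %0.H): old low half becomes the new high half */
    return (x << 16) | (x >> 16);
}
```

With 0x11223344 as input, the first three steps give 0x22114433 and the final half-word exchange gives 0x44332211.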

Marc