[FFmpeg-devel] [PATCH] Fix non-rounding up to next 16-bit aligned bug in IFF decoder

Thu Apr 29 15:15:41 CEST 2010

Sebastian Vater <cdgs.basty at googlemail.com> writes:

> Just got the idea, we can get rid of the GetBitContext
> completely...Instead of reading 4 bits, we simply read a byte:
> const uint8_t lut_offsets = *buf++; // instead of get_bits(gb,4);

That's a separate thing.

> Then we do loop unrolling by 8 and do two accesses to the lut, one with
> >> 4 and one with & 0x0F, or we even get rid of this and create a lut
> with 256 entries using AV_WN64A / AV_RN64A ;-)
>
> The advantage here is that on a 64 bit CPU we get another nice speed
> improvement ;-)
> If we avoid calculations with AV_RN64A etc.

Those macros don't do any calculations.  All they do is some magic to
avoid type aliasing errors.
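
To illustrate what "no calculations" means here, a minimal sketch of an
aliasing-safe 64-bit load and store.  The names rn64() and wn64() are
made up for this example, and memcpy is used as a portable stand-in; it
is not a copy of how FFmpeg defines its macros.  A compiler will
typically turn each helper into a single load or store.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for a native-endian 64-bit read and write.
 * Going through memcpy tells the compiler that the bytes may belong to
 * an object of any type, so no strict-aliasing assumption is violated.
 * No arithmetic is performed on the value. */
static inline uint64_t rn64(const void *p)
{
    uint64_t v;
    memcpy(&v, p, sizeof(v));
    return v;
}

static inline void wn64(void *p, uint64_t v)
{
    memcpy(p, &v, sizeof(v));
}
```

Casting a uint8_t pointer to uint64_t * and dereferencing it directly is
what these helpers avoid.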

> gcc just should use 2 registers on 32-bit CPU and that's it.

Should, but doesn't.

> Since we got rid of GetBitContext that shall be no problem ;-)
>
> Whatever way we take, I think it's better to create the tables based on
> endianness instead of doing this in the loop, because this adds an extra
> bit swap instruction in the inner loop if the target CPU has wrong
> alignment, which I'd prefer to avoid. ;-)

Yes.
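
A rough sketch of the 256-entry table idea, assuming one output byte per
pixel and one input bit per pixel for a given bitplane.  All names
(plane_lut, build_plane_lut, decode_plane) are hypothetical and this is
not the code in the decoder.  Each entry is built from a byte array in
memory order, so the table comes out right for the host endianness and
the inner loop needs no swap.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical table: entry [b] holds 8 bytes, one per bit of b, in
 * memory order.  The most significant bit of b is the leftmost pixel. */
static uint64_t plane_lut[256];

/* Fill the table for one bitplane (plane must be in the range 0..7). */
static void build_plane_lut(unsigned plane)
{
    for (unsigned b = 0; b < 256; b++) {
        uint8_t px[8];
        for (unsigned i = 0; i < 8; i++)
            px[i] = (uint8_t)(((b >> (7 - i)) & 1) << plane);
        /* Copying the byte array fixes the layout for this host. */
        memcpy(&plane_lut[b], px, sizeof(px));
    }
}

/* OR one bitplane into dst: each input byte expands to 8 output bytes.
 * dst must have room for 8 * buf_size bytes. */
static void decode_plane(uint8_t *dst, const uint8_t *buf, size_t buf_size)
{
    for (size_t i = 0; i < buf_size; i++) {
        uint64_t v;
        memcpy(&v, dst + 8 * i, sizeof(v));   /* aliasing-safe load  */
        v |= plane_lut[buf[i]];
        memcpy(dst + 8 * i, &v, sizeof(v));   /* aliasing-safe store */
    }
}
```

Calling build_plane_lut(2) and then decode_plane() on the input byte
0xA5 sets bit 2 in output bytes 0, 2, 5 and 7.  Because the OR works
byte by byte, the result is the same on little and big endian hosts.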

-- 
Måns Rullgård
mans at mansr.com