[FFmpeg-devel] [PATCH] Fix non-rounding up to next 16-bit aligned bug in IFF decoder
Måns Rullgård
mans
Thu Apr 29 15:15:41 CEST 2010
Sebastian Vater <cdgs.basty at googlemail.com> writes:
> Just got the idea, we can get rid of the GetBitContext
> completely...Instead of reading 4 bits, we simply read a byte:
> const uint8_t lut_offsets = *buf++; // instead of get_bits(gb,4);
That's a separate thing.
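For reference, a minimal sketch of the byte-read idea quoted above. It assumes MSB-first bit order, so the high nibble is what the first get_bits(gb, 4) would have returned. The helper name read_nibbles is hypothetical and not part of the decoder.

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

/* Hypothetical helper: consume one input byte and return both 4-bit
 * table indices, replacing two get_bits(gb, 4) calls. */
static inline void read_nibbles(const uint8_t **buf, unsigned *hi, unsigned *lo)
{
    assert(buf != NULL && *buf != NULL);
    const uint8_t b = *(*buf)++;  /* advance the caller's read pointer */
    *hi = b >> 4;                 /* first 4 bits in MSB-first order */
    *lo = b & 0x0F;               /* next 4 bits */
}
```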
> Then we unroll the loop by 8 and do two accesses to the lut, one with >> 4
> and one with & 0x0F, or we even get rid of this and create a lut table
> with 256 entries using AV_WN64A / AV_RN64A ;-)
>
> The advantage here is that on a 64 bit CPU we get another nice speed
> improvement ;-)
> If we avoid calculations with AV_RN64A etc.
Those macros don't do any calculations. All they do is some magic to
avoid type aliasing errors.
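To illustrate the point, here is a rough portable stand-in for what such accessors amount to. It uses memcpy rather than the definitions in libavutil/intreadwrite.h, and the names rn64, wn64 and or64 are hypothetical. A compiler will typically turn each helper into a single 64-bit load or store, so no arithmetic is involved.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for AV_RN64A / AV_WN64A: move 8 bytes between
 * memory and a uint64_t without violating the aliasing rules. */
static inline uint64_t rn64(const void *p)
{
    uint64_t v;
    memcpy(&v, p, sizeof(v));
    return v;
}

static inline void wn64(void *p, uint64_t v)
{
    memcpy(p, &v, sizeof(v));
}

/* OR a 64-bit pattern into 8 consecutive output bytes. */
static inline void or64(uint8_t *dst, uint64_t bits)
{
    assert(dst != NULL);
    wn64(dst, rn64(dst) | bits);
}
```

Casting the uint8_t pointer to uint64_t * and dereferencing it directly would be the aliasing violation these helpers are there to avoid.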
> gcc just should use 2 registers on 32-bit CPU and that's it.
Should, but doesn't.
> Since we got rid of GetBitContext that shall be no problem ;-)
>
> Whatever way we take, I think it's better to create the tables based on
> endianness instead of doing this in the loop, because this adds an extra
> bit swap instruction in the inner loop if the target CPU has wrong
> alignment, which I'd prefer to avoid. ;-)
Yes.
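A sketch of one way to get such a table, under my own assumptions about the layout: one output byte per pixel, the leftmost pixel in the most significant bit of each input byte, and up to 8 bitplanes. The names plane_lut, init_plane_lut and decode_plane are hypothetical. The entries are filled in byte by byte, so their in-memory layout is the same on either endianness and the inner loop needs no byte swap.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical table: for each plane and each input byte, the 8 output
 * bytes (one per pixel) with that plane's bit set where the input bit is 1. */
static uint64_t plane_lut[8][256];

static void init_plane_lut(void)
{
    for (int plane = 0; plane < 8; plane++) {
        for (int v = 0; v < 256; v++) {
            uint8_t px[8];
            for (int i = 0; i < 8; i++)  /* MSB of v is the leftmost pixel */
                px[i] = (uint8_t)(((v >> (7 - i)) & 1) << plane);
            /* Copy the bytes as they sit in memory: no endian dependence. */
            memcpy(&plane_lut[plane][v], px, sizeof(px));
        }
    }
}

/* OR one bitplane row of len bytes into dst, which must hold 8 * len bytes. */
static void decode_plane(uint8_t *dst, const uint8_t *buf, int len, int plane)
{
    assert(plane >= 0 && plane < 8);
    for (int i = 0; i < len; i++) {
        uint64_t acc;
        memcpy(&acc, dst + 8 * i, sizeof(acc));   /* aliasing-safe load */
        acc |= plane_lut[plane][buf[i]];
        memcpy(dst + 8 * i, &acc, sizeof(acc));   /* aliasing-safe store */
    }
}
```

The same layout could instead be produced at compile time with endian-dependent constants; the runtime fill above is just the shortest way to show the idea.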
--
Måns Rullgård
mans at mansr.com