[FFmpeg-devel] [PATCH] use AV_RB16 in cabac refill
Alexander Strange
astrange
Fri Mar 26 02:17:40 CET 2010
On Mar 25, 2010, at 4:08 AM, David Conrad wrote:
> On Mar 25, 2010, at 3:30 AM, Alexander Strange wrote:
>
>> Measured 1 cycle faster decode_cabac_residual on x86-64. Didn't try anywhere else, but I'd be a little interested in what arm does.
>
> It ought to be 2 instructions shorter and faster. However, both llvm and gcc decide to zero-extend from 16 bits twice, and (llvm-)gcc-4.2 decides to load bytestream twice.
Hmm, zero-extending in bswap_16 isn't really surprising, since asm operands are always extended to int.
The only solution there is to write AV_RB16 in asm too.
The --disable-asm build is remarkably bad here. I think it should use (p[0] << 8 | p[1]) instead of __attribute__((packed)) and bswap_16 when FAST_UNALIGNED isn't defined.
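For illustration, the byte-wise read suggested above could look roughly like the sketch below. The name read_be16_bytes is hypothetical and is not the actual AV_RB16 definition; the point is only that two byte loads, a shift and an OR need neither an unaligned 16-bit load nor a byte swap, and give the same result on any host byte order.

```c
#include <stdint.h>

/* Hypothetical helper, not FFmpeg's AV_RB16: read a big-endian
 * 16-bit value one byte at a time. The pointer may have any
 * alignment, and the result does not depend on host endianness. */
static inline unsigned read_be16_bytes(const uint8_t *p)
{
    /* p[0] is the most significant byte, p[1] the least significant. */
    return (unsigned)p[0] << 8 | p[1];
}
```

With the bytes 0x12 0x34 in memory this returns 0x1234, whether or not the pointer is 2-byte aligned.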
This isn't a very important change, so it can wait.