[FFmpeg-devel] [PATCH] Faster CABAC H.264 residual decoding
Måns Rullgård
Sun Apr 27 13:24:49 CEST 2008
Siarhei Siamashka <siarhei.siamashka at gmail.com> writes:
> On Sunday 27 April 2008, Måns Rullgård wrote:
>> matthieu castet <castet.matthieu at free.fr> writes:
>> > Jason Garrett-Glaser wrote:
>> >> On the advice of #ffmpeg-devel I have made a version with uint8_t
>> >> arrays instead of int.
>> >
>> > Don't forget that some CPUs (ARM, for example) don't have native 8-bit
>> > operations. Everything is done in 32 bits, and 8-bit behavior is
>> > emulated with extra operations.
>>
>> ARM has byte load and store instructions. All ALU operations are
>> 32-bit, except for certain multiplies. I doubt this is a problem
>> here.
>>
>> The only recent CPU I know of that lacks byte load/store is the first
>> generation of the Alpha.
>
> He probably just meant that reading bytes has higher latency
> (1 extra cycle) than reading ints on at least some ARM cores (ARM9).
Where do you find this information? The ARM926 data sheet only
mentions the 1-cycle penalty for shifted offsets.
> On the other hand, indexing bytes in an array does not require a shifted
> offset (which may also introduce some kind of penalty).
A left shift by 2 has no penalty on ARMv6.
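
To make the two access patterns concrete, here is a minimal sketch. The table names and contents are hypothetical and not taken from the patch, and the instruction selection mentioned in the comments is what an ARM compiler would typically emit, assuming a 4-byte int; it is not measured output.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical lookup tables, for illustration only. */
const uint8_t table_u8[4]  = { 10, 20, 30, 40 };
const int     table_int[4] = { 10, 20, 30, 40 };

/* Byte table: the element address is base + i, so no scaling is needed.
 * On ARM this is typically a byte load (LDRB) with a plain register offset. */
unsigned lookup_u8(size_t i)
{
    return table_u8[i];
}

/* int table: the element address is base + i * sizeof(int).
 * Assuming a 4-byte int, this is typically a word load (LDR) whose
 * register offset is shifted left by 2. */
int lookup_int(size_t i)
{
    return table_int[i];
}
```

Both functions return the same values; the only differences are the table footprint and the addressing mode used for the load.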
--
Måns Rullgård
mans at mansr.com