[FFmpeg-devel] [PATCH] Optimization of original IFF codec
Sebastian Vater
cdgs.basty
Fri Apr 23 12:58:45 CEST 2010
Hi a 38573583th time!
M?ns Rullg?rd a ?crit :
> I prefer to write just unsigned, saving a few keystrokes, but others
> may disagree. Do as you please.
>
Fixed!
>> if (avctx->bits_per_coded_sample > 8) {
>> av_log(avctx, AV_LOG_ERROR, "bit_per_coded_sample > 8 not supported\n");
>> @@ -52,8 +53,8 @@
>> av_log(avctx, AV_LOG_ERROR, "palette data underflow\n");
>> return AVERROR_INVALIDDATA;
>> }
>> - for (i=0; i < count; i++) {
>> - pal[i] = 0xFF000000 | AV_RB24( avctx->extradata + i*3 );
>> + for (i=0, j=0; i < count; i++, j += 3) {
>> + pal[i] = 0xFF000000 | AV_RB24( avctx->extradata + j);
>> }
>> return 0;
>> }
>>
>
> Did you check if this makes any difference in the generated code? The
> original is more readable.
>
No, and I doubt that many compilers can do such stuff automatically.
>> @@ -71,7 +72,7 @@
>> return AVERROR_INVALIDDATA;
>> }
>>
>> - s->planesize = avctx->width / 8;
>> + s->planesize = avctx->width >> 3;
>> s->planebuf = av_malloc(s->planesize + FF_INPUT_BUFFER_PADDING_SIZE);
>> if (!s->planebuf)
>> return AVERROR(ENOMEM);
>>
>
> This is most definitely a number being divided.
>
Fixed.
>> @@ -95,12 +96,16 @@
>> * @param plane plane number to decode as
>> */
>> #define DECLARE_DECODEPLANE(suffix, type) \
>> -static void decodeplane##suffix(void *dst, const uint8_t *const buf, int buf_size, int bps, int plane) \
>> +static inline void decodeplane##suffix(void *dst, \
>> + const uint8_t *const buf, \
>> + const unsigned int buf_size, \
>> + const unsigned int bps, \
>> + const unsigned int plane) \
>> { \
>> GetBitContext gb; \
>> - int i, b; \
>> + unsigned int i, b; \
>> init_get_bits(&gb, buf, buf_size * 8); \
>> - for(i = 0; i < (buf_size * 8 + bps - 1) / bps; i++) { \
>> + for(i = 0; i < ((buf_size << 3) + bps - 1) / bps; i++) { \
>> for (b = 0; b < bps; b++) { \
>> ((type *)dst)[ i*bps + b ] |= get_bits1(&gb) << plane; \
>> } \
>>
>
> Once again, did you check the generated code? If it makes no
> difference, the original code is better.
>
I don't have to check the generated code to know that changing signed to
unsigned will change signed multiplication to unsigned multiplication
assembly instruction ;)
Also there are many architectures out there, which do unsigned
multiplication faster than signed. This even is more true for divisions
and if you look at the for body...
> And again, and again, and again...
>
Fixed.
> NEVER make changes like this without looking at the output
Just have a knot in my brain, what was the way to extract disassembly
from object files?
I'll post the updated patch, when I compared the actual output.
--
Best regards,
:-) Basty/CDGS (-:
More information about the ffmpeg-devel
mailing list