[FFmpeg-devel] [PATCH] Indeo5 decoder
Maxim
max_pole
Fri Apr 17 17:12:01 CEST 2009
Michael Niedermayer wrote:
> On Fri, Apr 17, 2009 at 01:44:30PM +0200, Maxim wrote:
>
>> Michael Niedermayer wrote:
>>
>>> On Tue, Apr 07, 2009 at 05:08:34PM +0200, Maxim wrote:
>>>
>>>
>>>> Michael Niedermayer wrote:
>>>>
>>>>
>>>>> On Tue, Apr 07, 2009 at 10:52:34AM +0200, Maxim wrote:
>>>>>
>>>>>
>>>>>
>>>>>> Michael Niedermayer wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>>> On Mon, Apr 06, 2009 at 08:41:57PM +0200, Maxim wrote:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>> [...]
>>>>>
>>>>>
>>>>>
>>>>>>>> +
>>>>>>>> +
>>>>>>>> +/**
>>>>>>>> + * Build static indeo5 dequantization tables.
>>>>>>>> + */
>>>>>>>> +static av_cold void build_dequant_tables(void)
>>>>>>>> +{
>>>>>>>> + int mat, i, lev;
>>>>>>>> + uint32_t q1, q2, sf1, sf2;
>>>>>>>> +
>>>>>>>> + for (mat = 0; mat < 5; mat++) {
>>>>>>>> + /* build 8x8 intra/inter tables for all 24 quant levels */
>>>>>>>> + for (lev = 0; lev < 24; lev++) {
>>>>>>>> + sf1 = ivi5_scale_quant_8x8_intra[mat][lev];
>>>>>>>> + sf2 = ivi5_scale_quant_8x8_inter[mat][lev];
>>>>>>>> +
>>>>>>>> + for (i = 0; i < 64; i++) {
>>>>>>>> + q1 = (ivi5_base_quant_8x8_intra[mat][i] * sf1) >> 8;
>>>>>>>> + q2 = (ivi5_base_quant_8x8_inter[mat][i] * sf2) >> 8;
>>>>>>>> + deq8x8_intra[mat][lev][i] = av_clip(q1, 1, 255);
>>>>>>>> + deq8x8_inter[mat][lev][i] = av_clip(q2, 1, 255);
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>> 1..255, but they aren't uint8_t
>>>>>>> av_clip() seems useless, and maybe the whole table precalc as well
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>> They were made uint16_t for compatibility with Indeo4,
>>>>>> which uses 9-bit dequant tables...
>>>>>> The table precalculation should help avoid huge static tables...
>>>>>>
>>>>>>
>>>>>>
>>>>> Let me clarify my question: what is gained by merging a multiply and shift
>>>>> into the table?
>>>>> Is it faster? If so, by how much?
>>>>>
>>>>>
>> I did some research on that! Here are the answers to your questions:
>>
>> Question: Is it faster? If so, by how much?
>>
>> Yes, it's faster. I measured the calculation "time" using the
>> START/STOP_TIMER macros. I ran two tests on two different videos: one
>> containing mostly light colors (DPS190indeo.avi) and another containing
>> mostly dark colors (haegemonia.avi). I chose these because light colors
>> require higher scale factors and therefore a multiply by a larger
>> number.
>> The first test measured the dezicycles consumed by the inverse
>> quantization using TABLE lookup vs. MUL. It was run on my x86 laptop
>> with an Intel Core Duo processor at 2 GHz. Here are the raw numbers:
>>
>
> Could you show me the code you used?
> I am interested to see how you did the MUL.
>
In "decode_block":

    START_TIMER;
    q = (base_tab[pos] * scale_tab[quant]) >> 8;
    q = (q) ? q : 1;
    if (q != 1 && val) {
        if (val > 0) {
            val = (val * q) + (q >> 1) - (q & 1);
        } else
            val = (val * q) - (q >> 1) + (q & 1);
    }
    trvec[pos] = val;
    col_flags[pos & col_mask] |= !!val; /* track columns containing non-zero coeffs */
    STOP_TIMER("inverse_quant");

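
For comparison, here is a minimal standalone sketch of the two variants being timed. It is not the code from the patch: the table sizes and values are made up for illustration, and the names demo_base_tab, demo_scale_tab, deq_tab, quantizer_mul, quantizer_table and dequant are hypothetical. It only shows that a table filled once up front returns the same quantizer as the per-coefficient multiply and shift, so the two variants differ in speed and memory use, not in output. Note that the precalc in the patch additionally clips to 1..255; the values here stay below 255, so that difference does not show up.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_COEFFS 4 /* hypothetical; the real 8x8 tables have 64 entries */
#define NUM_LEVELS 3 /* hypothetical; the patch uses 24 quant levels */

/* Made-up values for illustration only, NOT the real Indeo5 tables. */
static const uint16_t demo_base_tab[NUM_COEFFS]  = { 0x1a, 0x2e, 0x36, 0x01 };
static const uint16_t demo_scale_tab[NUM_LEVELS] = { 0x10, 0x100, 0x400 };

/* Precomputed quantizers, one row per quant level (TABLE variant). */
static uint16_t deq_tab[NUM_LEVELS][NUM_COEFFS];

/* MUL variant: compute the quantizer for every coefficient on the fly. */
static int quantizer_mul(int pos, int quant)
{
    assert(pos >= 0 && pos < NUM_COEFFS);
    assert(quant >= 0 && quant < NUM_LEVELS);

    int q = (demo_base_tab[pos] * demo_scale_tab[quant]) >> 8;
    return q ? q : 1; /* a quantizer of 0 is replaced by 1 */
}

/* TABLE variant, step 1: fill the lookup table once at init time. */
static void build_deq_tab(void)
{
    for (int lev = 0; lev < NUM_LEVELS; lev++)
        for (int pos = 0; pos < NUM_COEFFS; pos++)
            deq_tab[lev][pos] = (uint16_t)quantizer_mul(pos, lev);
}

/* TABLE variant, step 2: a plain lookup per coefficient. */
static int quantizer_table(int pos, int quant)
{
    return deq_tab[quant][pos];
}

/* The rounding step shared by both variants, as in the snippet above. */
static int dequant(int val, int q)
{
    if (q != 1 && val) {
        if (val > 0)
            val = (val * q) + (q >> 1) - (q & 1);
        else
            val = (val * q) - (q >> 1) + (q & 1);
    }
    return val;
}
```

After calling build_deq_tab() once, dequant(val, quantizer_table(pos, quant)) and dequant(val, quantizer_mul(pos, quant)) give the same result for every valid pos and quant.
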
The table pointers base_tab and scale_tab are set up appropriately in
"decode_band"...
Regards
Maxim