[FFmpeg-devel] [PATCH] Mimic decoder
Ramiro Polla
ramiro
Mon Mar 17 20:01:05 CET 2008
Hello Michael,
Michael Niedermayer wrote:
> On Sun, Mar 16, 2008 at 04:15:41PM -0300, Ramiro Polla wrote:
> [...]
>>> [...]
>>>>>>> and
>>>>>>> value= 1 - (1<<num_bits);
>>>>>>> if(num_bits>1)
>>>>>>> value+= get_bits(num_bits-1);
>>>>>>> if(get_bits1())
>>>>>>> value -= value;
>>>>>>> block[ctx->scantable.permutated[pos]] = value;
>>>>>>> Or if you think the table matters speedwise then make it static dont
>>>>>>> duplicate it in every decoder instance.
>>>>>> ref calc static
>>>>>> mimic.o : 51868 51304 52192
>>>>>> avg time: 1.692 1.712 1.728
>>>>>>
>>>>>> That's very odd... Shouldn't the static table be at least as fast as
>>>>>> the
>>>>>> current code?
>>>>> Yes
>>>>> If you want to debug it the asm output from gcc would be a start but i
>>>>> doubt its worth the time spend.
>>>> With a bigger sample, I got
>>>> ref calc static
>>>> avg time: 6.688 6.932 6.744
>>> And the standard deviations? Mean is only one side of the story ...
>> The more tests I run, the more innacurate they become.
>>
>> ref calc static
>> avg time: 6.819993 6.873992 6.823993
>> std dev : 0.017436 0.022716 0.022627
>>
>> What's the best way to run benchmarks?
>
> Id try START/STOP_TIMER around the function you try to benchmark but
> watch the skip number if the amount of time spend in the function can vary
> alot.
>
> Anyway the difference between ref and static is considerably less than the
> standard deviation so i think the argument that its faster than static is
> about as likely as the opposit argument.
>
>
>> With no X, I run 20 runs of
>> ./ffmpeg_g -i ../data/y.ml20 -vcodec rawvideo -f rawvideo -r 5 -y /dev/null
>>
>> Is there a way to just benchmark demuxing+decoding without anything else?
>> (besides writing my own simple program).
>>
>> The input video has less than 5 frames per second, so I use -r 5 to avoid
>> outputting 1000 fps.
>>
>>>>>> Attached are the 2 patches I used to test. (lut_calc.diff and
>>>>>> lut_static.diff). Anyways, most FFmpeg stuff introduced a considerable
>>>>>> speedup over the original code, so I'm tempted to apply one of these
>>>>>> two.
>>>>>> Could I use the hardcoded tables ifdef there?
>>>>> I dont see much sense in it. I mean the calc code doesnt need the table
>>>>> at
>>>>> all and needs around 100 byte while the table needs 1024.
>>>> If it's ok I'd prefer to keep the current code with vlcdec_lookup in
>>>> MimicContext.
>>> You can keep it if you merge the *qscale into the table as its not
>>> const then anymore ...
>> I don't understand how. qscale depends on quality, which can change per
>> frame.
>
> Recalculate the table when it changed.
>
>
>> qscale also depends on pos...
>
> you could have a second table
>
> anyway thats just a suggestion, i am not saying that it would be a good idea
> what i am saying is that having a const table in the context is a bad idea.
Well, the values can be be different for chroma and luma samples as they
have a different min value for qscale. So I'd need to keep at least two
tables, and also pass that to vlc_decode_block. The tables would need to
change when quality!=ctx->quality and I'd need to keep track of quality.
Then there's the fact that pos 1 and 2 don't use qscale but rather a
fixed <<. Then I'd need more tables for that, or another way to work
around it...
Made static, finally. It's simpler. =)
> [...]
>
>> for(plane = 0; plane < 3; plane++) {
>> const int is_chroma = !!plane;
>> const int qscale = av_clip(10000-quality,is_chroma?1000:2000,10000)<<2;
>> const int stride = cur_frame.linesize[plane];
>> const uint8_t *base_src = prev_frame.data[plane];
>> uint8_t *base_dst = cur_frame. data[plane];
> ^
> random space?
Removed. At some revision it used to make things look cleaner...
> [...]
>> ctx->num_vblocks[i] = height >> (3 + !!i);
>> ctx->num_hblocks[i] = width >> (3 + !!i);
>> if(i && height & 15)
>> ctx->num_vblocks[i]++;
>
> ctx->num_vblocks[i] = -((-height) >> (3 + !!i));
Hmm... wouldn't that be only if height was negative? It is always
positive in the data.
>> }
>> } else
>> if(width != ctx->avctx->width || height != ctx->avctx->height) {
>> av_log(avctx, AV_LOG_ERROR, "resolution changing is not supported\n");
>> return -1;
>> }
>>
>
>> ctx->picture.data[0] = NULL;
>> ctx->picture.reference = 1;
>> if(avctx->get_buffer(avctx, &ctx->picture)) {
>
> the = NULL before get_buffer() looks very suspicious, you arent forgetting to
> release them somewhere?
The decoder always get_buffer for a new picture at the start of
decode_frame. That frame is kept to be used as back-reference for
another 16 frames. At the end of decode_frame, it releases the last
frame if it's no longer needed. decode_end then releases all frames.
Ramiro Polla
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: mimic.c
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20080317/8ae10946/attachment.asc>
More information about the ffmpeg-devel
mailing list