[FFmpeg-devel] [PATCH] Mimic decoder

Michael Niedermayer michaelni
Sun Mar 16 23:59:25 CET 2008


On Sun, Mar 16, 2008 at 04:15:41PM -0300, Ramiro Polla wrote:
[...]
>> [...]
>>>>>> and
>>>>>> value= 1 - (1<<num_bits);
>>>>>> if(num_bits>1)
>>>>>>     value+= get_bits(num_bits-1);
>>>>>> if(get_bits1())
>>>>>>     value -= value;
>>>>>> block[ctx->scantable.permutated[pos]] = value;
>>>>>> Or if you think the table matters speedwise then make it static dont
>>>>>> duplicate it in every decoder instance.
>>>>>           ref   calc  static
>>>>> mimic.o : 51868 51304 52192
>>>>> avg time: 1.692 1.712 1.728
>>>>>
>>>>> That's very odd... Shouldn't the static table be at least as fast as 
>>>>> the
>>>>> current code? 
>>>> Yes
>>>> If you want to debug it the asm output from gcc would be a start but i
>>>> doubt its worth the time spend.
>>> With a bigger sample, I got
>>>           ref   calc  static
>>> avg time: 6.688 6.932 6.744
>> And the standard deviations? Mean is only one side of the story ...
>
> The more tests I run, the more innacurate they become.
>
>           ref      calc     static
> avg time: 6.819993 6.873992 6.823993
> std dev : 0.017436 0.022716 0.022627
>
> What's the best way to run benchmarks?

Id try START/STOP_TIMER around the function you try to benchmark but
watch the skip number if the amount of time spend in the function can vary
alot.

Anyway the difference between ref and static is considerably less than the
standard deviation so i think the argument that its faster than static is
about as likely as the opposit argument.


> With no X, I run 20 runs of
> ./ffmpeg_g -i ../data/y.ml20 -vcodec rawvideo -f rawvideo -r 5 -y /dev/null
>
> Is there a way to just benchmark demuxing+decoding without anything else? 
> (besides writing my own simple program).
>
> The input video has less than 5 frames per second, so I use -r 5 to avoid 
> outputting 1000 fps.
>
>>>>> Attached are the 2 patches I used to test. (lut_calc.diff and 
>>>>> lut_static.diff). Anyways, most FFmpeg stuff introduced a considerable 
>>>>> speedup over the original code, so I'm tempted to apply one of these 
>>>>> two.
>>>>> Could I use the hardcoded tables ifdef there?
>>>> I dont see much sense in it. I mean the calc code doesnt need the table 
>>>> at
>>>> all and needs around 100 byte while the table needs 1024.
>>> If it's ok I'd prefer to keep the current code with vlcdec_lookup in 
>>> MimicContext.
>> You can keep it if you merge the *qscale into the table as its not
>> const then anymore ...
>
> I don't understand how. qscale depends on quality, which can change per 
> frame.

Recalculate the table when it changed.


> qscale also depends on pos...

you could have a second table

anyway thats just a suggestion, i am not saying that it would be a good idea
what i am saying is that having a const table in the context is a bad idea.


[...]

>     for(plane = 0; plane < 3; plane++) {
>         const int is_chroma = !!plane;
>         const int qscale = av_clip(10000-quality,is_chroma?1000:2000,10000)<<2;
>         const int stride = cur_frame.linesize[plane];
>         const uint8_t *base_src = prev_frame.data[plane];
>         uint8_t *base_dst = cur_frame. data[plane];
                                        ^
random space?

[...]
>             ctx->num_vblocks[i] = height >> (3 + !!i);
>             ctx->num_hblocks[i] = width  >> (3 + !!i);
>             if(i && height & 15)
>                 ctx->num_vblocks[i]++;

ctx->num_vblocks[i] = -((-height) >> (3 + !!i));


>         }
>     } else
>     if(width != ctx->avctx->width || height != ctx->avctx->height) {
>         av_log(avctx, AV_LOG_ERROR, "resolution changing is not supported\n");
>         return -1;
>     }
> 

>     ctx->picture.data[0] = NULL;
>     ctx->picture.reference = 1;
>     if(avctx->get_buffer(avctx, &ctx->picture)) {

the = NULL before get_buffer() looks very suspicious, you arent forgetting to
release them somewhere?


[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

No snowflake in an avalanche ever feels responsible. -- Voltaire
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: Digital signature
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20080316/fdd11b91/attachment.pgp>



More information about the ffmpeg-devel mailing list