[FFmpeg-devel] [PATCH]lavc/hevc_cabac Optimise ff_hevc_hls_residual_coding (especially ARM)

Fri Jan 22 18:54:30 CET 2016

Hi

>Hi,
>
>2016-01-22 14:29 GMT+01:00 John Cox <jc at kynesim.co.uk>:
>>>This is a big slowdown on Win64 and UHD-bluray like sequences, but
>>>that can be switched off in that case.
>>
>> I'm a bit surprised that it generated a big slowdown - some cache must
>> be running just on the edge, but yes if you normally have hi-bitrate
>> stuff then it isn't wanted.  On my test streams the bitrates were
>> normally quite low - quite unlike what I would expect from blu-ray
>> sequences.
>
>Initial (4 sequences):
>    6553 decicycles in g, 8387110 runs,   1498 skips
>    5916 decicycles in g,33546118 runs,   8314 skips
>    5028 decicycles in g,67101499 runs,   7365 skips
>    4729 decicycles in g,33548420 runs,   6012 skips
>
>Deactivating USE_N_END_1:
>    4746 decicycles in g,16774296 runs,   2920 skips
>    5373 decicycles in g,33545629 runs,   8803 skips
>    4141 decicycles in g,67098928 runs,   9936 skips
>    3869 decicycles in g,33544593 runs,   9839 skips
>
>But I see the first one surprisingly having half the iterations (but
>this has almost converged at this point).
>So 10-20%.
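
For reference, the per-sequence saving can be worked out directly from the decicycle averages quoted above. A minimal sketch, assuming the four lines of each run correspond to the same four sequences in the same order; the helper names are made up for illustration:

```c
#include <stdio.h>

/* Percentage of cycles saved when going from `before` to `after`. */
static double percent_saved(double before, double after)
{
    return 100.0 * (before - after) / before;
}

/* Decicycle averages copied from the quoted timer output. */
static const int initial_run[4]  = { 6553, 5916, 5028, 4729 };
static const int disabled_run[4] = { 4746, 5373, 4141, 3869 };

/* Print the relative saving for each of the four sequences. */
static void print_savings(void)
{
    for (int i = 0; i < 4; i++)
        printf("sequence %d: %.1f%% fewer decicycles\n",
               i + 1, percent_saved(initial_run[i], disabled_run[i]));
}
```

The first sequence is the one with half the runs in the second measurement, so its figure is probably not comparable with the other three.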

Coo - that is big.
How are you profiling that and with what streams?

>I think it has more to do with cache pressure, both code, which
>increases from 8 to 9.5KB, and data, with already "large" tables in a
>loop that may need to be tight.

I agree (and it is what I was trying to suggest in my previous
comment).  It also suggests that on x86 you might benefit from
non-inlined cabac_gets to keep the code size small.

>> Default it to off on x86 but on on ARM?
>
>Yes, I think so.

Is ARCH_X86/ARM an appropriate switch for this?
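
Something along these lines is what I have in mind. This is only a sketch, not the patch itself: it assumes ARCH_ARM is defined to 0 or 1 the way config.h does it, and the use_n_end_1() helper is invented purely so the default can be checked.

```c
/* In an FFmpeg build ARCH_ARM comes from config.h as 0 or 1. The fallback
 * below only exists so this sketch compiles on its own. */
#ifndef ARCH_ARM
#define ARCH_ARM 0
#endif

/* Default USE_N_END_1 to on for ARM and off everywhere else (x86 included).
 * It can still be forced either way with -DUSE_N_END_1=0 or =1. */
#ifndef USE_N_END_1
#define USE_N_END_1 ARCH_ARM
#endif

/* Hypothetical helper: reports the value the build ended up with. */
static int use_n_end_1(void)
{
    return USE_N_END_1;
}
```

Keying off ARCH_ARM alone means x86 needs no explicit mention; whether other architectures should follow ARM or x86 is an open question.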

Regards

JC