[FFmpeg-devel] [PATCH 2/4] lavc/mpegvideo: use H263DSP dequant function
Andreas Rheinhardt
andreas.rheinhardt at outlook.com
Sat Jul 6 21:27:26 EEST 2024
Rémi Denis-Courmont:
> On Saturday, 6 July 2024, 19:20:33 EEST, Andreas Rheinhardt wrote:
>> Rémi Denis-Courmont:
>>> On Saturday, 6 July 2024, 18:23:00 EEST, Andreas Rheinhardt wrote:
>>>>>  static void dct_unquantize_h263_inter_c(MpegEncContext *s,
>>>>>                                    int16_t *block, int n, int qscale)
>>>>>  {
>>>>> -    int i, level, qmul, qadd;
>>>>> +    int qmul = qscale << 1;
>>>>> +    int qadd = (qscale - 1) | 1;
>>>>>      int nCoeffs;
>>>>>
>>>>>      av_assert2(s->block_last_index[n]>=0);
>>>>>
>>>>> -    qadd = (qscale - 1) | 1;
>>>>> -    qmul = qscale << 1;
>>>>> -
>>>>>      nCoeffs= s->inter_scantable.raster_end[ s->block_last_index[n] ];
>>>>> -
>>>>> -    for(i=0; i<=nCoeffs; i++) {
>>>>> -        level = block[i];
>>>>> -        if (level) {
>>>>> -            if (level < 0) {
>>>>> -                level = level * qmul - qadd;
>>>>> -            } else {
>>>>> -                level = level * qmul + qadd;
>>>>> -            }
>>>>> -            block[i] = level;
>>>>> -        }
>>>>> -    }
>>>>> +    s->h263dsp.h263_dct_unquantize_inter(block, nCoeffs, qmul, qadd);
>>>>
>>>> This adds an indirection. I have asked you to actually benchmark this
>>>> code (and not only the DSP function you add), but you never did.
>>>
>>> I already pointed out previously that this is the way this project does
>>> DSP code. Certainly it would be nice to hard-code the path when there is
>>> only one possible. This is often the case on Armv8 notably, and of course
>>> on platforms without optimisations.
>>>
>>> But that's a general problem way beyond the scope of this patchset. We
>>> always add indirect function calls in this sort of situation, and I don't
>>> see why I would have a duty to benchmark it, so I am going to ignore this.
>>
>> You have a duty to benchmark it because you add it where it wasn't before.
>
> I don't recall other people benchmarking the indirect branch they've added
> previously for other DSP code. Recent examples include VVC and FLAC.
> Rightfully so, because there is not really an alternative anyway. Even GNU
> IFUNCs and Glibc alternative libraries internally use an indirect branch
> (hidden in PLT/GOT), and FFmpeg can't self-patch at load-time like the Linux
> kernel does, nor can it generate dynamic PLT entries with direct branches.
>
> Also, if an indirect call is unacceptable, then how come the calling code is
> itself an indirect call, and for abstraction rather than performance?
I did not even say that it is unacceptable, merely that it should be
benchmarked.
>
> Your request is completely arbitrary here. Yes, there is already an indirect
> call close up, and so? I'm not trying to clean MpegEncContext here, only
> trying to add one function to checkasm, RVV and (with James' work) post-MMX
> x86.
>
> Lastly, you don't even specify what benchmark to run. Comparing something
> against nothing is, as my manager would say, pointless, since the relative
> overhead ought to be an approximation of infinity (in practice, you end up
> measuring the overhead of the benchmarking code instead).
You should benchmark the function you are modifying, namely
dct_unquantize_h263_(intra|inter)_c, before and after the patch.
- Andreas