[FFmpeg-devel] [PATCH 2/4] lavc/mpegvideo: use H263DSP dequant function
Rémi Denis-Courmont
remi at remlab.net
Sat Jul 6 19:47:20 EEST 2024
On Saturday, 6 July 2024, 19.20.33 EEST Andreas Rheinhardt wrote:
> Rémi Denis-Courmont:
> > On Saturday, 6 July 2024, 18.23.00 EEST Andreas Rheinhardt wrote:
> >>>  static void dct_unquantize_h263_inter_c(MpegEncContext *s,
> >>>                                    int16_t *block, int n, int qscale)
> >>>  {
> >>> -    int i, level, qmul, qadd;
> >>> +    int qmul = qscale << 1;
> >>> +    int qadd = (qscale - 1) | 1;
> >>>      int nCoeffs;
> >>>
> >>>      av_assert2(s->block_last_index[n]>=0);
> >>>
> >>> -    qadd = (qscale - 1) | 1;
> >>> -    qmul = qscale << 1;
> >>> -
> >>>      nCoeffs= s->inter_scantable.raster_end[ s->block_last_index[n] ];
> >>> -
> >>> -    for(i=0; i<=nCoeffs; i++) {
> >>> -        level = block[i];
> >>> -        if (level) {
> >>> -            if (level < 0) {
> >>> -                level = level * qmul - qadd;
> >>> -            } else {
> >>> -                level = level * qmul + qadd;
> >>> -            }
> >>> -            block[i] = level;
> >>> -        }
> >>> -    }
> >>> +    s->h263dsp.h263_dct_unquantize_inter(block, nCoeffs, qmul, qadd);
> >>
> >> This adds an indirection. I have asked you to actually benchmark this
> >> code (and not only the DSP function you add), but you never did.
> >
> > I already pointed out previously that this is the way this project does DSP
> > code. Certainly it would be nice to hard-code the path when there is only
> > one possible. This is often the case on Armv8 notably, and of course on
> > platforms without optimisations.
> >
> > But that's a general problem way beyond the scope of this patchset. We
> > always add indirect function calls in this sort of situation, and I don't
> > see why I would have a duty to benchmark it, so I am going to ignore this.
>
> You have a duty to benchmark it because you add it where it wasn't before.

I don't recall other people benchmarking the indirect branch they've added
previously for other DSP code. Recent examples include VVC and FLAC.

Rightfully so, because there is not really an alternative anyway. Even GNU
IFUNCs and glibc alternative libraries internally use an indirect branch
(hidden in the PLT/GOT), and FFmpeg can't self-patch at load time like the
Linux kernel does, nor can it generate dynamic PLT entries with direct branches.

Also, if an indirect call is unacceptable, then how come the calling code is
itself an indirect call, and one made for abstraction rather than performance?

Your request is completely arbitrary here. Yes, there is already an indirect
call right next to it, so what? I'm not trying to clean up MpegEncContext here,
only trying to add one function to checkasm, RVV and (with James' work)
post-MMX x86.

Lastly, you don't even specify what benchmark to run. Comparing something
against nothing is, as my manager would say, pointless, since the relative
overhead ought to be an approximation of infinity (in practice, you end up
measuring the overhead of the benchmarking code instead).
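
For reference, the pattern under discussion can be sketched roughly as follows.
This is only an illustration, not the code from the patch or the FFmpeg tree:
DemoH263DSP, demo_h263dsp_init() and demo_unquantize_inter() are hypothetical
names, and the parameter types of the function pointer are assumed from the
call shown in the quoted diff.

```c
#include <stdint.h>

/* Reference C dequantizer, mirroring the loop removed in the quoted diff:
 * each nonzero coefficient is scaled by qmul and pushed away from zero by
 * qadd. "last" is the index of the last coefficient to process (inclusive). */
static void h263_dct_unquantize_inter_c(int16_t *block, int last,
                                        int qmul, int qadd)
{
    for (int i = 0; i <= last; i++) {
        int level = block[i];

        if (level) {
            if (level < 0)
                level = level * qmul - qadd;
            else
                level = level * qmul + qadd;
            block[i] = (int16_t)level;
        }
    }
}

/* Hypothetical stand-in for a DSP context: one function pointer per routine. */
typedef struct DemoH263DSP {
    void (*h263_dct_unquantize_inter)(int16_t *block, int last,
                                      int qmul, int qadd);
} DemoH263DSP;

/* Hypothetical init: install the C fallback. A real init function would then
 * override the pointer with an optimised version depending on CPU flags. */
static void demo_h263dsp_init(DemoH263DSP *c)
{
    c->h263_dct_unquantize_inter = h263_dct_unquantize_inter_c;
}

/* Caller side: derive qmul/qadd from qscale as in the quoted diff, then make
 * the indirect call through the context. */
static void demo_unquantize_inter(const DemoH263DSP *dsp, int16_t *block,
                                  int last, int qscale)
{
    int qmul = qscale << 1;
    int qadd = (qscale - 1) | 1;

    dsp->h263_dct_unquantize_inter(block, last, qmul, qadd);
}
```

With qscale = 4 this gives qmul = 8 and qadd = 3, so a coefficient of 3 becomes
27 and a coefficient of -2 becomes -19, while zero coefficients stay zero.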
--
Rémi Denis-Courmont
http://www.remlab.net/