[FFmpeg-devel] [PATCHv5 1/4] lavc/h263dsp: add DCT dequantisation functions
Rémi Denis-Courmont
remi at remlab.net
Wed Jun 12 21:10:43 EEST 2024
On Wednesday, 12 June 2024, 20:40:37 EEST, James Almer wrote:
> On 6/12/2024 1:47 AM, Rémi Denis-Courmont wrote:
> > Note that optimised implementations of these functions will be taken
> > into actual use only if MpegEncContext.dct_unquantize_h263_{inter,intra}
> > are *not* overloaded by existing optimisations.
> >
> > ---
> > This adds the plus ones back, saving two branch instructions in C and
> > one in assembler (at the cost of two unconditional adds).
>
> See my reply in the previous version. Not sure if it will help with this.
We can of course avoid the branches: this version does, as did the initial
versions. In C (and in RVV), however, we can't avoid incrementing the
pointer and a counter variable.
If you change the loop as you suggest:
    for (size_t i = 1; i <= nCoeffs; i++) {
        int level = block[i];
        if (level) {
            if (level < 0)
                level = level * qmul - qadd;
            else
                level = level * qmul + qadd;
            block[i] = level;
        }
    }
... at best, an optimising compiler will reinterpret it to:
    if (nCoeffs >= 1) {
        block++;                /* skip block[0] */
        end = block + nCoeffs;  /* one past the last coefficient */
    loop:
        level = *block;
        if (level) {
            tmp = level * qmul;
            if (level < 0)
                tmp -= qadd;
            else
                tmp += qadd;
            *block = tmp;
        }
        block++;                /* advance even when level is zero */
        if (block < end)
            goto loop;
    }
Or perhaps the compiler will keep an explicit counter, which is even worse.
This does not save branches, nor increments. It just looks like it because of
the syntactic sugar that is the for() loop. In reality, this only duplicates
code (as we can no longer share between inter/intra).
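To illustrate the sharing argument, here is a minimal sketch of one loop body serving both cases, with the start offset and the count adjusted once by the caller. The names dequant_span, dequant_inter and dequant_intra are hypothetical and are not the functions from the patch; `last` is assumed to be the index of the last coefficient to process, and the intra DC coefficient is assumed to be handled elsewhere.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical shared kernel: dequantise len coefficients starting at block.
 * Zero coefficients are left untouched. */
static void dequant_span(int16_t *block, size_t len, int qmul, int qadd)
{
    for (size_t i = 0; i < len; i++) {
        int level = block[i];

        if (level) {
            if (level < 0)
                level = level * qmul - qadd;
            else
                level = level * qmul + qadd;
            block[i] = (int16_t)level;
        }
    }
}

/* Inter (assumed convention): process indices 0..last inclusive,
 * hence one unconditional add on the count. */
static void dequant_inter(int16_t *block, size_t last, int qmul, int qadd)
{
    dequant_span(block, last + 1, qmul, qadd);
}

/* Intra (assumed convention): index 0 is the DC coefficient and is skipped,
 * so process indices 1..last; one unconditional add on the pointer. */
static void dequant_intra(int16_t *block, size_t last, int qmul, int qadd)
{
    dequant_span(block + 1, last, qmul, qadd);
}
```

In this sketch the only difference between the two entry points is a single add before the shared loop, so neither needs a branch to pick its starting index.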
--
レミ・デニ-クールモン
http://www.remlab.net/