[FFmpeg-devel] [PATCH 1/3] riscv: add CPU flags for the RISC-V Vector extension

Lynne dev at lynne.ee
Sun Sep 4 09:39:36 EEST 2022


Sep 4, 2022, 07:41 by remi at remlab.net:

> On Sunday, 4 September 2022, 0:38:32 EEST, Lynne wrote:
>
>> I need to know the length in C, not assembly.
>>
>
> There may be some corner cases where that makes sense, but typically it 
> doesn't. Even if you're dealing in fixed-size macro blocks, you should leverage 
> the larger vectors to unroll and process multiple macro blocks in parallel.
>

Some aspects of a split-radix FFT work better if you know upfront
how much you can fit into a register, in particular the tail,
which consists of 2 equal-length transforms. On AVX we interleave
the coefficients from 2x4pt transforms during lookups, since we
can do both simultaneously and save on shuffles. Doing them
individually wouldn't be as efficient. Since interleaving is done
during the permute step, we have to know from C how much to
interleave.
Of course, if you switched away from a split-radix algorithm
(X+X/2+X/2), you could have a very simple 100-line FFT given
arbitrarily long vectors (or the pretense of such), but without
the hardware to back that up, the penalty for using a suboptimal
algorithm wouldn't be worth it.
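
To illustrate why the permute step needs the register size, here is
a rough sketch. It is not the actual FFmpeg code: the function name,
the index-map representation and the exact layout are all made up
for the example. It only shows that the lookup table built in C
depends on how many coefficients of each tail transform fit side by
side in one register ("chunk").

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper, not FFmpeg's API.
 *
 * a[] and b[] are the lookup indices of the two equal-length tail
 * transforms (n entries each). The output map alternates groups of
 * `chunk` indices from a and from b, so that one register load picks
 * up coefficients for both transforms at once.
 *
 * `chunk` is whatever the target register width allows, which is
 * exactly the value that has to be known on the C side.
 */
static void interleave_tail_map(int *out, const int *a, const int *b,
                                size_t n, size_t chunk)
{
    size_t o = 0;

    assert(chunk > 0);

    for (size_t i = 0; i < n; i += chunk) {
        /* The last group may be shorter if chunk does not divide n. */
        size_t len = (n - i < chunk) ? n - i : chunk;

        for (size_t j = 0; j < len; j++)
            out[o++] = a[i + j];
        for (size_t j = 0; j < len; j++)
            out[o++] = b[i + j];
    }
}
```

With two 4-entry index lists and chunk = 2, the map alternates pairs
from each transform. With chunk = 4 the two lists are simply stored
back to back, so a different register width gives a different table.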


> And besides, how do you want to get the value if not with assembler? This is 
> currently not found in ELF HWCAP and probably never will be.
>

That sucks. For me, knowing how wide the units are is as important
as knowing how much L1 cache you have.


> I disagree. There are currently no means to negotiate a vector length with the 
> OS, so that seems highly premature. And even if there was such a mechanism, 
> it's simply much faster to call VSETVL in an inline assembler macro where 
> needed than to compute the whole set of CPU flags.
>

Guess that's what I'll have to do.
In due time anyway, who knows how many years it'll be until
a cheap enough device appears with vector support that
doesn't merely do what SVE2 devices did by reusing old NEON
unit designs.
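
For reference, a minimal sketch of getting the length into C with
inline assembler. The helper name is hypothetical. It reads the
vlenb CSR (VLEN in bytes) instead of issuing VSETVL, which is a
different way of getting at the same information. It also assumes
the file is compiled with the V extension enabled, so it does not
replace runtime detection: on a CPU without vectors the instruction
would trap.

```c
#include <stddef.h>

/*
 * Hypothetical helper, not part of this patch set.
 * Returns the vector register length in bytes, or 0 when the build
 * does not target the RISC-V Vector extension.
 */
static inline size_t rvv_vlenb(void)
{
#if defined(__riscv_vector)
    size_t vlenb;

    /* vlenb is a read-only CSR holding VLEN / 8. */
    __asm__ ("csrr %0, vlenb" : "=r" (vlenb));
    return vlenb;
#else
    /* Assumption: no vector support known at compile time. */
    return 0;
#endif
}
```

The C side could then size its interleaving from the returned value,
treating 0 as "use the scalar path".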


