[FFmpeg-devel] [PATCH 1/4] lavc/vp8dsp: R-V V 256 bilin,epel

Wed Jul 31 17:11:18 EEST 2024

On Tuesday, 30 July 2024, 20:57:28 EEST, flow gg wrote:
> From my understanding, moving from supporting only 128b to adding 256b
> versions can simultaneously improve LMUL and solve some issues related to
> insufficient vector registers (vvc, vp9).

On the contrary, if vectors are too short to process a macroblock in a single
round, then there should be a loop with maximum LMUL, and the code should be
the same for all vector lengths. That is just normal textbook RVV coding style.
There should *not* be vector length specialisation since the code can be
shared.
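
To illustrate the loop shape I mean, here is a rough scalar model in C. It is
only a sketch, not the actual vp8dsp code: sim_vsetvl_e8() is a hypothetical
stand-in for what vsetvli would return with 8-bit elements, and avg_row() is
an invented example kernel (a rounded average of two byte rows). The point is
that the same loop body serves every vector length, and only the number of
rounds changes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of vsetvli with e8 elements: returns how many
 * elements the next round processes for a given VLEN (bits) and LMUL. */
static size_t sim_vsetvl_e8(size_t remaining, unsigned vlen_bits,
                            unsigned lmul)
{
    size_t vlmax = (size_t)vlen_bits / 8 * lmul;

    assert(vlmax > 0);
    return remaining < vlmax ? remaining : vlmax;
}

/* Rounded average of two byte rows, strip-mined with LMUL=8.
 * Returns the number of rounds taken, for illustration only. */
static unsigned avg_row(uint8_t *dst, const uint8_t *a, const uint8_t *b,
                        size_t n, unsigned vlen_bits)
{
    unsigned rounds = 0;

    while (n > 0) {
        size_t vl = sim_vsetvl_e8(n, vlen_bits, 8);

        /* This inner loop models one group of vector instructions. */
        for (size_t i = 0; i < vl; i++)
            dst[i] = (uint8_t)((a[i] + b[i] + 1) >> 1);

        dst += vl;
        a   += vl;
        b   += vl;
        n   -= vl;
        rounds++;
    }
    return rounds;
}
```

With 300 bytes of input, this model takes three rounds at VLEN=128 and two at
VLEN=256, and the output is identical in both cases. Nothing in the loop
depends on the vector length.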

> If we continue to support 512, 1024, ..., it almost exclusively improves
> LMUL.

I don't think so. Even more so than 256-bit hardware, 512-bit and 1024-bit 
hardware really _needs_ to short-circuit vector processing based on VL and not 
simply follow LMUL.

> Therefore, 256b is the most worthwhile addition, and we can skip
> adding 512b, 1024b, etc.
> 
> Additionally, even though longer hardware will continually be developed,
> the most used will probably still be 128b and 256b.

I wouldn't be so sure. Realistically, lower-end SoCs decode video with DSPs. 
So video decoder vector optimisations are mainly for the server side, and 
that's exactly where larger vector sizes are most likely (e.g. AVX-512).

> If someone complains that FFmpeg's RVV doesn't support 1024b well, it can
> be said that it's not just RISC-V that lacks good support.
> However, if the 256b performance is not good, then it seems like an issue
> with RISC-V. :)
> 
> I think maybe we can give some preference to the two smallest lengths?

As I wrote, I am not necessarily against specialising for 256-bit as such. I 
am against:
1) specialising functions that do not really need to be specialised,
2) adding tons of boilerplate (notably in the C code) for it.

-- 
雷米‧德尼-库尔蒙
http://www.remlab.net/