[FFmpeg-devel] [PATCH 0/5] RISC-V: Improve H264 decoding performance using RVV intrinsic

Lynne dev at lynne.ee
Tue May 9 18:47:55 EEST 2023


May 9, 2023, 11:51 by arnie.chang at sifive.com:

> We are submitting a set of patches that significantly improve H.264 decoding performance
> by using RVV intrinsic code. The average speedup (in FPS) achieved by these patches is more than 2x,
> as measured on 720p videos running on an internal FPGA board.
>
> Patch1: add support for RVV intrinsic code in the configure file
> Patch2: optimize chroma motion compensation
> Patch3: optimize luma motion compensation
> Patch4: optimize DSP functions, such as IDCT, in-loop filtering, and weighted filtering
> Patch5: optimize intra prediction
>
> Arnie Chang (5):
>  configure: Add detection of RISC-V vector intrinsic support
>  lavc/h264chroma: Add vectorized implementation of chroma MC for RISC-V
>  lavc/h264qpel: Add vectorized implementation of luma MC for RISC-V
>  lavc/h264dsp: Add vectorized implementation of DSP functions for
>  RISC-V
>  lavc/h264pred: Add vectorized implementation of intra prediction for
>  RISC-V
>

Could you rewrite this in asm instead? I'd like RISC-V to follow the same
policy we have for Arm: no intrinsics. There's a long list of reasons we
don't use intrinsics, which I won't get into.
Just a few days ago, I discovered that our PPC intrinsics were performing
quite badly due to compiler issues, in some cases 500x slower than C.
Also, we don't care about overall speedup. We have checkasm --bench
to measure the per-function speedup over C.
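
To illustrate why an overall FPS figure says little about any single function, here is a minimal sketch based on Amdahl's law. The function name overall_speedup and all of the numbers are hypothetical, chosen for illustration only; they are not measurements from these patches or output from checkasm.

```python
def overall_speedup(fraction_optimized: float, function_speedup: float) -> float:
    """Amdahl's law: whole-program speedup when only a fraction of the
    runtime is spent in code that became `function_speedup` times faster."""
    if not 0.0 <= fraction_optimized <= 1.0:
        raise ValueError("fraction_optimized must be in [0, 1]")
    if function_speedup <= 0.0:
        raise ValueError("function_speedup must be positive")
    remaining = 1.0 - fraction_optimized             # runtime left untouched
    optimized = fraction_optimized / function_speedup  # runtime after the change
    return 1.0 / (remaining + optimized)


if __name__ == "__main__":
    # Hypothetical (fraction of runtime, per-function speedup) pairs.
    # A speedup below 1.0 models a function that got slower than plain C.
    for fraction, speedup in [(0.6, 8.0), (0.1, 8.0), (0.6, 0.5)]:
        total = overall_speedup(fraction, speedup)
        print(f"fraction={fraction:.1f} per-function={speedup:.1f}x "
              f"overall={total:.2f}x")
```

The same per-function gain yields very different overall numbers depending on how much of the runtime that function accounts for, which depends on the video and the platform. A per-function measurement against the C reference avoids that dependency.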

