[FFmpeg-devel] [PATCH] lavc/opusdsp: rewrite R-V V postfilter

Rémi Denis-Courmont remi at remlab.net
Thu Nov 2 23:16:18 EET 2023


On Thursday, 2 November 2023, 23:07:03 EET, Rémi Denis-Courmont wrote:
> This uses a more traditional approach, allowing processing of up to
> period minus two elements per iteration. This also allows the algorithm
> to work for any vector length.
> 
> As the T-Head C908 device under test can load 16 elements per loop
> iteration, there is unsurprisingly a small performance drop when the
> period is minimal and the parallelism is capped at 13 elements:
> 
> Before:
> postfilter_15_c:         21222.2
> postfilter_15_rvv_f32:   22007.7
> postfilter_512_c:        20189.7
> postfilter_512_rvv_f32:  22004.2
> postfilter_1022_c:       20189.7
> postfilter_1022_rvv_f32: 22004.2
> 
> After:
> postfilter_15_c:         20189.5
> postfilter_15_rvv_f32:    7057.2
> postfilter_512_c:        20189.5
> postfilter_512_rvv_f32:   5667.2
> postfilter_1022_c:       20192.7
> postfilter_1022_rvv_f32:  5667.2
> ---
>  libavcodec/riscv/opusdsp_init.c | 22 ++-------
>  libavcodec/riscv/opusdsp_rvv.S  | 87 +++++++++++++++------------------
>  2 files changed, 42 insertions(+), 67 deletions(-)
> 
> diff --git a/libavcodec/riscv/opusdsp_init.c b/libavcodec/riscv/opusdsp_init.c
> index 7fde9b1fa8..f5a842a326 100644
> --- a/libavcodec/riscv/opusdsp_init.c
> +++ b/libavcodec/riscv/opusdsp_init.c
> @@ -25,30 +25,14 @@
>  #include "libavutil/riscv/cpu.h"
>  #include "libavcodec/opusdsp.h"
> 
> -void ff_opus_postfilter_rvv_128(float *data, int period, float *g, int len);
> -void ff_opus_postfilter_rvv_256(float *data, int period, float *g, int len);
> -void ff_opus_postfilter_rvv_512(float *data, int period, float *g, int len);
> -void ff_opus_postfilter_rvv_1024(float *data, int period, float *g, int len);
> +void ff_opus_postfilter_rvv(float *data, int period, float *g, int len);
> 
>  av_cold void ff_opus_dsp_init_riscv(OpusDSP *d)
>  {
>  #if HAVE_RVV
>      int flags = av_get_cpu_flags();
> 
> -    if (flags & AV_CPU_FLAG_RVV_F32)
> -        switch (ff_get_rv_vlenb()) {
> -        case 16:
> -            d->postfilter = ff_opus_postfilter_rvv_128;
> -            break;
> -        case 32:
> -            d->postfilter = ff_opus_postfilter_rvv_256;
> -            break;
> -        case 64:
> -            d->postfilter = ff_opus_postfilter_rvv_512;
> -            break;
> -        case 128:
> -            d->postfilter = ff_opus_postfilter_rvv_512;
> -            break;
> -        }
> +    if ((flags & AV_CPU_FLAG_RVV_F32) && (flags & AV_CPU_FLAG_RVB_ADDR))

I will also add a check for Zbb here, since the new code uses the MIN
instruction.

> +        d->postfilter = ff_opus_postfilter_rvv;
>  #endif
>  }
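
For readers who want the idea without the assembly, here is a minimal C
sketch of the "period minus two" blocking described in the commit message.
It is not the RVV code from this patch. It assumes the usual 5-tap comb
filter form (taps at period - 2 .. period + 2 samples back), and the names
postfilter_scalar, postfilter_blocked, postfilter_variants_match and the
lanes parameter are hypothetical, introduced only for illustration. The
period bounds 15 and 1022 are taken from the benchmark names above.

```c
/* Scalar reference. data[-period - 2 .. -1] must hold valid history. */
static void postfilter_scalar(float *data, int period, const float *g, int len)
{
    for (int i = 0; i < len; i++)
        data[i] += g[0] * data[i - period]
                 + g[1] * (data[i - period + 1] + data[i - period - 1])
                 + g[2] * (data[i - period + 2] + data[i - period - 2]);
}

/* Blocked variant. `lanes` is a hypothetical stand-in for how many
 * elements the hardware can process per loop iteration. */
static void postfilter_blocked(float *data, int period, const float *g,
                               int len, int lanes)
{
    while (len > 0) {
        int vl = period - 2;        /* dependency limit of the filter */
        if (vl > lanes) vl = lanes; /* hardware limit */
        if (vl > len)   vl = len;   /* remaining samples: the MIN step */

        /* The newest sample read is data[(vl - 1) - period + 2], whose
         * index is at most -1, so every input precedes this block. The
         * elements are therefore independent; walking backwards here
         * shows that their order does not matter. */
        for (int i = vl - 1; i >= 0; i--)
            data[i] += g[0] * data[i - period]
                     + g[1] * (data[i - period + 1] + data[i - period - 1])
                     + g[2] * (data[i - period + 2] + data[i - period - 2]);

        data += vl;
        len  -= vl;
    }
}

/* Returns 1 if both variants agree on a deterministic test signal. */
static int postfilter_variants_match(int period, int len, int lanes)
{
    enum { MAX_PERIOD = 1022, MAX_LEN = 960 };
    static float a[MAX_PERIOD + 2 + MAX_LEN], b[MAX_PERIOD + 2 + MAX_LEN];
    const float g[3] = { 0.3f, 0.2f, 0.1f }; /* arbitrary example gains */
    const int hist = period + 2;             /* history before data[0] */

    if (period < 15 || period > MAX_PERIOD ||
        len < 0 || len > MAX_LEN || lanes < 1)
        return 0;

    for (int i = 0; i < hist + len; i++)
        a[i] = b[i] = (float)((i * 37) % 101) / 101.0f - 0.5f;

    postfilter_scalar(a + hist, period, g, len);
    postfilter_blocked(b + hist, period, g, len, lanes);

    for (int i = 0; i < hist + len; i++) {
        float d = a[i] - b[i];
        if (d < 0.0f)
            d = -d;
        if (d > 1e-6f)
            return 0;
    }
    return 1;
}
```

With period = 15 and lanes = 16, the block size works out to 13, which
matches the cap mentioned in the commit message. With larger periods the
hardware limit is the binding one instead.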

-- 
雷米‧德尼-库尔蒙
http://www.remlab.net/




