[FFmpeg-devel] [PATCH 6/6] lavc/takdsp: R-V V decorrelate_sm

Rémi Denis-Courmont remi at remlab.net
Thu Dec 21 18:07:55 EET 2023


On Monday, 18 December 2023 at 17:16:27 EET, flow gg wrote:
> C908:
> decorrelate_sm_c: 130.0
> decorrelate_sm_rvv_i32: 43.7

+
+func ff_decorrelate_sm_rvv, zve32x
+1:
+        vsetvli  t0, a2, e32, m8, ta, ma
+        vle32.v  v0, (a0)
+        sub      a2, a2, t0
+        vle32.v  v8, (a1)
+        vsra.vi  v16, v8, 1

You should load v8 first, since it is used as input before v0.

+        vsub.vv  v0, v0, v16
+        vse32.v  v0, (a0)
+        sh2add   a0, t0, a0
+        vadd.vv  v0, v0, v8

You can use VSSRA, and then VADD won't need to depend on the output of VSUB.

+        vse32.v  v0, (a1)
+        sh2add   a1, t0, a1
+        bnez     a2, 1b
+        ret
+endfunc
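
The VSSRA remark above can be checked with a scalar model. The sketch below is only an illustration, not part of the patch: the function names are made up, and it assumes that VSSRA runs in round-to-nearest-up mode (vxrm = rnu), so that a shift by 1 adds the shifted-out bit back. Under that assumption b - (b >> 1) equals the VSSRA result, and the second output can be computed from the original p1 value instead of from the VSUB result. It also assumes that >> on a negative int32_t is an arithmetic shift, as VSRA is.

```c
#include <stdint.h>

/* Model of the quoted loop: VSRA, VSUB, then VADD on the VSUB result. */
void decorrelate_sm_chain(int32_t *p1, int32_t *p2, int length)
{
    for (int i = 0; i < length; i++) {
        int32_t a = p1[i], b = p2[i];

        a    -= b >> 1;     /* vsra.vi + vsub.vv */
        p1[i] = a;
        p2[i] = a + b;      /* vadd.vv waits for vsub.vv */
    }
}

/* Hypothetical model of "vssra.vi vd, vs, 1" with vxrm = rnu:
 * the bit shifted out is added back to the shifted value. */
int32_t vssra1_rnu(int32_t b)
{
    return (b >> 1) + (b & 1);
}

/* Same outputs, but the two results no longer depend on each other. */
void decorrelate_sm_split(int32_t *p1, int32_t *p2, int length)
{
    for (int i = 0; i < length; i++) {
        int32_t a = p1[i], b = p2[i];

        p1[i] = a - (b >> 1);       /* vsra.vi + vsub.vv */
        p2[i] = a + vssra1_rnu(b);  /* vssra.vi + vadd.vv on the original a */
    }
}

/* Returns 1 if both models agree on a small range of inputs
 * (values kept small so that no signed overflow occurs). */
int decorrelate_sm_models_agree(void)
{
    enum { N = 17 };
    int32_t p1a[N], p2a[N], p1b[N], p2b[N];

    for (int i = 0; i < N; i++) {
        p1a[i] = p1b[i] = 100 - 13 * i;
        p2a[i] = p2b[i] = i - 8;    /* covers negative, zero, odd and even */
    }
    decorrelate_sm_chain(p1a, p2a, N);
    decorrelate_sm_split(p1b, p2b, N);
    for (int i = 0; i < N; i++)
        if (p1a[i] != p1b[i] || p2a[i] != p2b[i])
            return 0;
    return 1;
}
```

In the vector code this would presumably mean setting vxrm to rnu before the loop and keeping the loaded p1 vector in its own register group, so that VSUB and VADD can both read it.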

-- 
雷米‧德尼-库尔蒙
http://www.remlab.net/




