[FFmpeg-cvslog] lavc/vp8dsp: use saturating add/sub for R-V V DC add

Rémi Denis-Courmont git at videolan.org
Sun Jul 28 18:03:00 EEST 2024


ffmpeg | branch: master | Rémi Denis-Courmont <remi at remlab.net> | Thu Jul 25 17:40:26 2024 +0300| [9b4655c3a145d5d0f315c3bd0a80792f37603c2f] | committer: Rémi Denis-Courmont

lavc/vp8dsp: use saturating add/sub for R-V V DC add

T-Head C908 (cycles):
vp7_idct_dc_add_c:          108.5
vp7_idct_dc_add_rvv_i32:     56.2 (before)
vp7_idct_dc_add_rvv_i32:     47.2 (after)
vp8_idct_dc_add_c:           96.2
vp8_idct_dc_add_rvv_i32:     43.0 (before)
vp8_idct_dc_add_rvv_i32:     34.0 (after)

> http://git.videolan.org/gitweb.cgi/ffmpeg.git/?a=commit;h=9b4655c3a145d5d0f315c3bd0a80792f37603c2f
---

 libavcodec/riscv/vp8dsp_rvv.S | 18 ++++++++++++------
 1 file changed, 12 insertions(+), 6 deletions(-)

diff --git a/libavcodec/riscv/vp8dsp_rvv.S b/libavcodec/riscv/vp8dsp_rvv.S
index 6ff443fbe6..a8b3e239ba 100644
--- a/libavcodec/riscv/vp8dsp_rvv.S
+++ b/libavcodec/riscv/vp8dsp_rvv.S
@@ -172,12 +172,18 @@ func ff_vp78_idct_dc_add_rvv, zve32x
         vsetivli   zero, 4, e8, mf4, ta, ma
         sh         zero, (a1)
         vlse32.v   v8, (a0), a2
-        vsetivli   zero, 16, e16, m2, ta, ma
-        vzext.vf2  v16, v8
-        vadd.vx    v16, v16, a3
-        vmax.vx    v16, v16, zero
-        vsetvli    zero, zero, e8, m1, ta, ma
-        vnclipu.wi v8, v16, 0
+        vsetivli   zero, 16, e8, m1, ta, ma
+        bgez       a3, 1f
+
+        # block[0] < 0
+        neg        a3, a3
+        vssubu.vx  v8, v8, a3
+        vsetivli   zero, 4, e8, mf4, ta, ma
+        vsse32.v   v8, (a0), a2
+        ret
+
+1:      # block[0] >= 0
+        vsaddu.vx  v8, v8, a3
         vsetivli   zero, 4, e8, mf4, ta, ma
         vsse32.v   v8, (a0), a2
         ret
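
The equivalence the patch relies on can be checked with a small scalar model in C. This is only a sketch, not code from the tree: the helper names (`clamp_add_wide`, `sat_addu8`, `sat_subu8`, `clamp_add_sat`, `count_mismatches`) are hypothetical, and it assumes the DC value in a3 has a magnitude of at most 255 so that it fits in an 8-bit operand. How larger values are handled is not visible in this hunk.

```c
#include <stdint.h>

/* Old sequence, per pixel: widen to 16 bits, add the DC, clamp to [0, 255]
 * (models vzext.vf2 + vadd.vx + vmax.vx + vnclipu.wi). */
static uint8_t clamp_add_wide(uint8_t px, int dc)
{
    int v = px + dc;
    return v < 0 ? 0 : v > 255 ? 255 : (uint8_t)v;
}

/* Scalar model of one element of vsaddu.vx: unsigned 8-bit saturating add. */
static uint8_t sat_addu8(uint8_t a, uint8_t b)
{
    unsigned s = (unsigned)a + b;
    return s > 255 ? 255 : (uint8_t)s;
}

/* Scalar model of one element of vssubu.vx: unsigned 8-bit saturating sub. */
static uint8_t sat_subu8(uint8_t a, uint8_t b)
{
    return a > b ? (uint8_t)(a - b) : 0;
}

/* New sequence, per pixel: branch on the sign of the DC, then use a single
 * saturating operation. Assumption: -255 <= dc <= 255. */
static uint8_t clamp_add_sat(uint8_t px, int dc)
{
    return dc >= 0 ? sat_addu8(px, (uint8_t)dc)
                   : sat_subu8(px, (uint8_t)-dc);
}

/* Exhaustively compare both forms; returns the number of differing results. */
static int count_mismatches(void)
{
    int bad = 0;
    for (int dc = -255; dc <= 255; dc++)
        for (int px = 0; px <= 255; px++)
            bad += clamp_add_wide((uint8_t)px, dc) != clamp_add_sat((uint8_t)px, dc);
    return bad;
}
```

Under that assumption `count_mismatches()` returns 0. Because the saturating form never needs a 16-bit intermediate, the new code can stay at e8/m1 instead of widening to e16/m2 and narrowing back.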


