[FFmpeg-cvslog] lavc/vp8dsp: use saturating add/sub for R-V V DC add
Rémi Denis-Courmont
git at videolan.org
Sun Jul 28 18:03:00 EEST 2024
ffmpeg | branch: master | Rémi Denis-Courmont <remi at remlab.net> | Thu Jul 25 17:40:26 2024 +0300| [9b4655c3a145d5d0f315c3bd0a80792f37603c2f] | committer: Rémi Denis-Courmont
lavc/vp8dsp: use saturating add/sub for R-V V DC add
T-Head C908 (cycles):
vp7_idct_dc_add_c: 108.5
vp7_idct_dc_add_rvv_i32: 56.2 (before)
vp7_idct_dc_add_rvv_i32: 47.2 (after)
vp8_idct_dc_add_c: 96.2
vp8_idct_dc_add_rvv_i32: 43.0 (before)
vp8_idct_dc_add_rvv_i32: 34.0 (after)
> http://git.videolan.org/gitweb.cgi/ffmpeg.git/?a=commit;h=9b4655c3a145d5d0f315c3bd0a80792f37603c2f
---
libavcodec/riscv/vp8dsp_rvv.S | 18 ++++++++++++------
1 file changed, 12 insertions(+), 6 deletions(-)
diff --git a/libavcodec/riscv/vp8dsp_rvv.S b/libavcodec/riscv/vp8dsp_rvv.S
index 6ff443fbe6..a8b3e239ba 100644
--- a/libavcodec/riscv/vp8dsp_rvv.S
+++ b/libavcodec/riscv/vp8dsp_rvv.S
@@ -172,12 +172,18 @@ func ff_vp78_idct_dc_add_rvv, zve32x
vsetivli zero, 4, e8, mf4, ta, ma
sh zero, (a1)
vlse32.v v8, (a0), a2
- vsetivli zero, 16, e16, m2, ta, ma
- vzext.vf2 v16, v8
- vadd.vx v16, v16, a3
- vmax.vx v16, v16, zero
- vsetvli zero, zero, e8, m1, ta, ma
- vnclipu.wi v8, v16, 0
+ vsetivli zero, 16, e8, m1, ta, ma
+ bgez a3, 1f
+
+ # block[0] < 0
+ neg a3, a3
+ vssubu.vx v8, v8, a3
+ vsetivli zero, 4, e8, mf4, ta, ma
+ vsse32.v v8, (a0), a2
+ ret
+
+1: # block[0] >= 0
+ vsaddu.vx v8, v8, a3
vsetivli zero, 4, e8, mf4, ta, ma
vsse32.v v8, (a0), a2
ret
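
For readers less familiar with RVV, the change above can be modelled per pixel in scalar C. This is only a rough sketch, not FFmpeg code: the helper names (dc_add_widen, dc_add_sat, count_mismatches) are hypothetical, and it assumes the DC magnitude fits in 8 bits, because at e8 a .vx instruction only uses the low 8 bits of the scalar register. Under that assumption, the removed widen/add/clamp sequence and the new saturating add/sub give the same result for every pixel value.

```c
#include <stdint.h>

/* Model of the removed sequence: zero-extend the pixel to 16 bits,
 * add the DC value, clamp below at 0 (vmax.vx) and narrow with
 * unsigned saturation (vnclipu.wi). */
static uint8_t dc_add_widen(uint8_t px, int dc)
{
    int16_t sum = (int16_t)(px + dc);

    if (sum < 0)
        sum = 0;
    return sum > 255 ? 255 : (uint8_t)sum;
}

/* Model of the added sequence: branch on the sign of the DC value,
 * then do an unsigned saturating subtract (vssubu.vx) or an unsigned
 * saturating add (vsaddu.vx) directly on 8-bit elements.
 * Assumption: |dc| <= 255, so the (uint8_t) casts lose nothing. */
static uint8_t dc_add_sat(uint8_t px, int dc)
{
    if (dc < 0) {
        uint8_t mag = (uint8_t)-dc;

        return px > mag ? (uint8_t)(px - mag) : 0;
    } else {
        unsigned sum = px + (unsigned)(uint8_t)dc;

        return sum > 255 ? 255 : (uint8_t)sum;
    }
}

/* Count (pixel, dc) pairs where the two models disagree,
 * over the range covered by the assumption above. */
static int count_mismatches(void)
{
    int mismatches = 0;

    for (int dc = -255; dc <= 255; dc++)
        for (int px = 0; px <= 255; px++)
            mismatches += dc_add_widen((uint8_t)px, dc) !=
                          dc_add_sat((uint8_t)px, dc);
    return mismatches;
}
```

The sketch says nothing about DC values outside [-255, 255]; that range is a limit of the model, not a statement about what the decoder passes in.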