[FFmpeg-cvslog] swscale: aarch64: Fix yuv2rgb with negative strides

Martin Storsjö git at videolan.org
Fri Nov 4 14:32:38 EET 2022


ffmpeg | branch: release/4.2 | Martin Storsjö <martin at martin.st> | Tue Oct 25 13:13:34 2022 +0300| [9d5450b514217b8aca408652d17a2ff00a9ffa51] | committer: Martin Storsjö

swscale: aarch64: Fix yuv2rgb with negative strides

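Treat the 32 bit stride registers as signed, by sign extending (SXTW)
instead of zero extending (UXTW) them when adding them to the 64 bit
pointers.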

Alternatively, we could make the stride arguments ptrdiff_t instead
of int and change all of the assembly to operate on these
registers at their full 64 bit width, but that's a much larger
and more intrusive change (and risks missing some operation, which
would still truncate the intermediates to 32 bit).
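
To illustrate why the extension mode matters, here is a minimal C sketch
that models the two forms of the add instruction. The helper names
add_uxtw and add_sxtw are hypothetical and exist only for this
illustration; they are not part of libswscale.

```c
#include <stdint.h>

/* Models `add xN, xN, wM, UXTW`: the 32 bit increment is zero extended
 * to 64 bit before the addition, so a negative value turns into a large
 * positive offset. */
static uint64_t add_uxtw(uint64_t base, int32_t inc)
{
    return base + (uint64_t)(uint32_t)inc;
}

/* Models `add xN, xN, wM, SXTW`: the 32 bit increment is sign extended
 * to 64 bit before the addition, so a negative value moves the pointer
 * backwards as intended. */
static uint64_t add_sxtw(uint64_t base, int32_t inc)
{
    return base + (uint64_t)(int64_t)inc;
}
```

With a base of 0x10000 and an increment of -64, add_sxtw yields 0xFFC0,
while add_uxtw yields 0x10000FFC0, which is almost 4 GB past the
intended address. For non-negative increments the two forms agree,
which is presumably why the issue only showed up with negative strides.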

Fixes: https://trac.ffmpeg.org/ticket/9985

Signed-off-by: Martin Storsjö <martin at martin.st>
(cherry picked from commit cb803a0072cb98945dcd3f1660bd2a975650ce42)
Signed-off-by: Martin Storsjö <martin at martin.st>

> http://git.videolan.org/gitweb.cgi/ffmpeg.git/?a=commit;h=9d5450b514217b8aca408652d17a2ff00a9ffa51
---

 libswscale/aarch64/yuv2rgb_neon.S | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/libswscale/aarch64/yuv2rgb_neon.S b/libswscale/aarch64/yuv2rgb_neon.S
index b7446aa105..10bd1f7480 100644
--- a/libswscale/aarch64/yuv2rgb_neon.S
+++ b/libswscale/aarch64/yuv2rgb_neon.S
@@ -118,8 +118,8 @@
 .endm
 
 .macro increment_yuv422p
-    add                 x6,  x6,  w7, UXTW                              // srcU += incU
-    add                 x13, x13, w14, UXTW                             // srcV += incV
+    add                 x6,  x6,  w7, SXTW                              // srcU += incU
+    add                 x13, x13, w14, SXTW                             // srcV += incV
 .endm
 
 .macro compute_rgba r1 g1 b1 a1 r2 g2 b2 a2
@@ -188,8 +188,8 @@ function ff_\ifmt\()_to_\ofmt\()_neon, export=1
     st4                 {v16.8B,v17.8B,v18.8B,v19.8B}, [x2], #32
     subs                w8, w8, #16                                     // width -= 16
     b.gt                2b
-    add                 x2, x2, w3, UXTW                                // dst  += padding
-    add                 x4, x4, w5, UXTW                                // srcY += paddingY
+    add                 x2, x2, w3, SXTW                                // dst  += padding
+    add                 x4, x4, w5, SXTW                                // srcY += paddingY
     increment_\ifmt
     subs                w1, w1, #1                                      // height -= 1
     b.gt                1b


