[FFmpeg-devel] [PATCH 1/3] aarch64: vp9mc: Load only 12 pixels in the 4 pixel wide horizontal filter

Janne Grunau janne-ffmpeg at jannau.net
Thu Dec 19 23:12:21 EET 2024


This reduces the amount the horizontal filters read beyond the filter
width to a consistent 1 pixel. The data is not used so this is usually
not noticeable. It becomes a problem when the application allocates
frame buffers only for the aligned picture size and the end of the
buffer is at a page boundary. This happens for picture sizes which are
a multiple of the page size, like 1280x640. The frame buffer allocation
is most likely done via mmap + MAP_ANONYMOUS, so the start and end of
the buffer are page aligned and the previous and next pages are not
necessarily mapped.
Under these conditions, as seen in Firefox, a read beyond the end of
the buffer results in a segfault.
After the over-read is reduced to a single pixel it's reasonable to use
VP9's emulated edge motion compensation for this.

Fixes: https://bugzilla.mozilla.org/show_bug.cgi?id=1881185
Signed-off-by: Janne Grunau <janne-ffmpeg at jannau.net>
---
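Note (not part of the commit message): the arithmetic behind "a
consistent 1 pixel" can be sketched in C as below. This is only an
illustration. The names pixels_needed() and overread() are made up for
this note and do not exist in libavcodec. The sketch assumes that an
8-tap filter needs width + 7 input pixels per row, and it takes the
number of bytes loaded per row from the ld1 instructions in the diff.

```c
#include <assert.h>
#include <stddef.h>

/* Number of taps in the sub-pixel filter (8tap in the function name). */
enum { FILTER_TAPS = 8 };

/* Input pixels one row of the horizontal filter needs in order to
 * produce `width` output pixels. Hypothetical helper. */
size_t pixels_needed(size_t width)
{
    return width + FILTER_TAPS - 1;
}

/* Pixels read per row beyond the last pixel the filter uses, given
 * that `loaded` pixels are fetched from the start of the filter
 * window. Hypothetical helper. */
size_t overread(size_t width, size_t loaded)
{
    assert(loaded >= pixels_needed(width));
    return loaded - pixels_needed(width);
}
```

With these definitions, overread(4, 16) is 5 for the old 4 pixel wide
path, while overread(4, 12), overread(8, 16) and overread(16, 24) are
all 1.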
 libavcodec/aarch64/vp9mc_neon.S | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/libavcodec/aarch64/vp9mc_neon.S b/libavcodec/aarch64/vp9mc_neon.S
index abf2bae9db07..38f44ca56d0d 100644
--- a/libavcodec/aarch64/vp9mc_neon.S
+++ b/libavcodec/aarch64/vp9mc_neon.S
@@ -230,6 +230,9 @@ function \type\()_8tap_\size\()h_\idx1\idx2
         // reduced dst stride
 .if \size >= 16
         sub             x1,  x1,  x5
+.elseif \size == 4
+        add             x12, x2,  #8
+        add             x13, x7,  #8
 .endif
         // size >= 16 loads two qwords and increments x2,
         // for size 4/8 it's enough with one qword and no
@@ -248,9 +251,14 @@ function \type\()_8tap_\size\()h_\idx1\idx2
 .if \size >= 16
         ld1             {v4.8b,  v5.8b,  v6.8b},  [x2], #24
         ld1             {v16.8b, v17.8b, v18.8b}, [x7], #24
-.else
+.elseif \size == 8
         ld1             {v4.8b,  v5.8b},  [x2]
         ld1             {v16.8b, v17.8b}, [x7]
+.else // \size == 4
+        ld1             {v4.8b},  [x2]
+        ld1             {v16.8b}, [x7]
+        ld1             {v5.s}[0],  [x12], x3
+        ld1             {v17.s}[0], [x13], x3
 .endif
         uxtl            v4.8h,  v4.8b
         uxtl            v5.8h,  v5.8b
-- 
2.45.2


