[FFmpeg-cvslog] aarch64: vp9mc: Load only 12 pixels in the 4 pixel wide horizontal filter
Janne Grunau
git at videolan.org
Sat Jan 4 14:33:39 EET 2025
ffmpeg | branch: release/3.4 | Janne Grunau <janne-ffmpeg at jannau.net> | Fri Jan 3 01:54:38 2025 +0100| [180f8216cdb643deff9255f3fa6cc09d517e52d4] | committer: Ronald S. Bultje
aarch64: vp9mc: Load only 12 pixels in the 4 pixel wide horizontal filter
This reduces the amount the horizontal filters read beyond the filter
width to a consistent 1 pixel. The data is not used so this is usually
not noticeable. It becomes a problem when the application allocates
frame buffers only for the aligned picture size and the end of it is at
a page boundary. This happens for picture sizes which are a multiple of
the page size like 1280x640. Because of its size, the frame buffer
allocation is most likely done via mmap + MAP_ANONYMOUS, so the start and
end of the buffer are page aligned and the previous and next pages are
not necessarily mapped.
Under these conditions, as seen in Firefox, a read beyond the end of the
buffer results in a segfault.
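To make the page boundary condition concrete, here is a minimal sketch in C.
It assumes 8-bit samples, a stride equal to the width, a page-aligned start
and a 4096 byte page size; none of these are stated in the commit, and both
helper names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed page size; the real value is platform dependent. */
enum { ASSUMED_PAGE_SIZE = 4096 };

/* Hypothetical helper: number of mapped bytes that remain after the end
 * of a plane of `plane_bytes` bytes which starts on a page boundary. */
static size_t slack_after_plane(size_t plane_bytes, size_t page_size)
{
    assert(page_size > 0);
    return (page_size - plane_bytes % page_size) % page_size;
}

/* Hypothetical helper: returns 1 if reading `overread` bytes past the end
 * of the plane reaches into the following (possibly unmapped) page. */
static int overread_touches_next_page(size_t plane_bytes, size_t overread,
                                      size_t page_size)
{
    return overread > slack_after_plane(plane_bytes, page_size);
}
```

With these assumptions a 1280x640 plane is 819200 bytes, exactly 200 pages,
so there is no slack and even a 1 byte over-read lands in the next page.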
After the over-read is reduced to a single pixel it's reasonable to use
VP9's emulated edge motion compensation for this.
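The over-read amounts can be checked with a little arithmetic. The sketch
below is illustrative only: it assumes 8-bit pixels and that an 8-tap
horizontal filter needs width + 7 input pixels per row, and takes the loaded
byte counts from the ld1 instructions in the diff (16 before and 12 after the
change for width 4, 16 for width 8, 24 for the first 16 outputs of wider
blocks). The helper names are hypothetical.

```c
#include <assert.h>

enum { FILTER_TAPS = 8 }; /* from the _8tap_ function name */

/* Input pixels needed to produce `width` horizontally filtered outputs. */
static int pixels_needed(int width)
{
    assert(width > 0);
    return width + FILTER_TAPS - 1;
}

/* Pixels loaded beyond what the filter actually uses. */
static int overread_pixels(int loaded, int width)
{
    return loaded - pixels_needed(width);
}
```

Under these assumptions overread_pixels(16, 4) is 5 for the old 4 pixel wide
path, while overread_pixels(12, 4), overread_pixels(16, 8) and
overread_pixels(24, 16) are all 1.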
Fixes: https://bugzilla.mozilla.org/show_bug.cgi?id=1881185
Signed-off-by: Janne Grunau <janne-ffmpeg at jannau.net>
Signed-off-by: Ronald S. Bultje <rsbultje at gmail.com>
(cherry picked from commit 430c38f698a65d597e863330810b05e083682be6)
> http://git.videolan.org/gitweb.cgi/ffmpeg.git/?a=commit;h=180f8216cdb643deff9255f3fa6cc09d517e52d4
---
libavcodec/aarch64/vp9mc_neon.S | 10 +++++++++-
1 file changed, 9 insertions(+), 1 deletion(-)
diff --git a/libavcodec/aarch64/vp9mc_neon.S b/libavcodec/aarch64/vp9mc_neon.S
index f67624ca04..7cdcd675ed 100644
--- a/libavcodec/aarch64/vp9mc_neon.S
+++ b/libavcodec/aarch64/vp9mc_neon.S
@@ -260,6 +260,9 @@ function \type\()_8tap_\size\()h_\idx1\idx2
// reduced dst stride
.if \size >= 16
sub x1, x1, x5
+.elseif \size == 4
+ add x12, x2, #8
+ add x13, x7, #8
.endif
// size >= 16 loads two qwords and increments x2,
// for size 4/8 it's enough with one qword and no
@@ -278,9 +281,14 @@ function \type\()_8tap_\size\()h_\idx1\idx2
.if \size >= 16
ld1 {v4.8b, v5.8b, v6.8b}, [x2], #24
ld1 {v16.8b, v17.8b, v18.8b}, [x7], #24
-.else
+.elseif \size == 8
ld1 {v4.8b, v5.8b}, [x2]
ld1 {v16.8b, v17.8b}, [x7]
+.else // \size == 4
+ ld1 {v4.8b}, [x2]
+ ld1 {v16.8b}, [x7]
+ ld1 {v5.s}[0], [x12], x3
+ ld1 {v17.s}[0], [x13], x3
.endif
uxtl v4.8h, v4.8b
uxtl v5.8h, v5.8b
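As a rough scalar model of the new 4 pixel wide load path above, the sketch
below reads 8 bytes and then 4 more from offset 8, so only 12 bytes of the
source row are touched. The function name and the 16 byte destination layout
(v4 in bytes 0..7, the low half of v5 in bytes 8..15) are assumptions made for
illustration, not part of the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of the size == 4 loads for one source row. */
static void load_row_12(uint8_t dst[16], const uint8_t *src)
{
    assert(dst != NULL && src != NULL);
    /* A single-lane ld1 leaves the other lanes unchanged; they are zeroed
     * here only to keep the model deterministic. */
    memset(dst, 0, 16);
    memcpy(dst, src, 8);         /* ld1 {v4.8b}, [x2] */
    memcpy(dst + 8, src + 8, 4); /* ld1 {v5.s}[0], [x12] with x12 = x2 + 8 */
}
```

Calling load_row_12 on a source buffer of exactly 12 bytes stays within
bounds, which is the property the patch relies on.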