[FFmpeg-devel] [PATCH] tx_float_neon: Do not access outside stack.
Rémi Denis-Courmont
remi at remlab.net
Sun Oct 9 17:11:04 EEST 2022
Le sunnuntaina 9. lokakuuta 2022, 16.14.47 EEST Reimar Döffinger a écrit :
> Use load/store instructions that modify sp to save
> registers to stack, like it is done for all other
> functions.
> At least valgrind complains about the current code.
> ---
> libavutil/aarch64/tx_float_neon.S | 16 ++++++++--------
> 1 file changed, 8 insertions(+), 8 deletions(-)
>
> diff --git a/libavutil/aarch64/tx_float_neon.S
> b/libavutil/aarch64/tx_float_neon.S index 4126c3b812..4be93cc963 100644
> --- a/libavutil/aarch64/tx_float_neon.S
> +++ b/libavutil/aarch64/tx_float_neon.S
> @@ -866,10 +866,10 @@ FFT16_FN ns_float, 1
>
> .macro FFT32_FN name, no_perm
> function ff_tx_fft32_\name\()_neon, export=1
> - stp d8, d9, [sp, #-16]
> - stp d10, d11, [sp, #-32]
> - stp d12, d13, [sp, #-48]
> - stp d14, d15, [sp, #-64]
> + stp d8, d9, [sp, #-16]!
> + stp d10, d11, [sp, #-16]!
> + stp d12, d13, [sp, #-16]!
> + stp d14, d15, [sp, #-16]!
While this fixes the ABI violation, it introduces multiple data dependencies on
stack pointer due to write-back.
The idiomatic way to do this is to allocate the entire needed stack space in
the first store / last load, and use positive offsets elsewhence.
>
> LOAD_SUBADD
> SETUP_SR_RECOMB 32, x7, x8, x9
> @@ -911,10 +911,10 @@ function ff_tx_fft32_\name\()_neon, export=1
> zip2 v31.2d, v11.2d, v15.2d
> st1 { v28.4s, v29.4s, v30.4s, v31.4s }, [x1]
>
> - ldp d14, d15, [sp, #-64]
> - ldp d12, d13, [sp, #-48]
> - ldp d10, d11, [sp, #-32]
> - ldp d8, d9, [sp, #-16]
> + ldp d14, d15, [sp], #16
> + ldp d12, d13, [sp], #16
> + ldp d10, d11, [sp], #16
> + ldp d8, d9, [sp], #16
>
> ret
> endfunc
--
Rémi Denis-Courmont
http://www.remlab.net/
More information about the ffmpeg-devel
mailing list