[FFmpeg-devel] [PATCH] lavc/aarch64: add a few SIMD function for AAC PS
James Almer
jamrial at gmail.com
Thu May 25 19:22:22 EEST 2017
On 5/25/2017 12:50 PM, Clément Bœsch wrote:
> ---
>
> This is still not benchmarked (written and verified with qemu).
>
> For instance, I wrote an alternative implementation for
> stereo_interpolate[0], which needs to be compared with the current one:
>
> function ff_ps_stereo_interpolate_neon, export=1
>         ld1 {v0.4S}, [x2]
>         ld1 {v1.4S}, [x3]
> 1:
>         ld1 {v2.2S}, [x0]
>         ld1 {v3.2S}, [x1]
>         fadd v0.4S, v0.4S, v1.4S
>         fmul v4.2S, v2.2S, v0.S[0]
>         fmul v5.2S, v2.2S, v0.S[1]
>         fmla v4.2S, v3.2S, v0.S[2]
>         fmla v5.2S, v3.2S, v0.S[3]
>         st1 {v4.2S}, [x0], #8
>         st1 {v5.2S}, [x1], #8
>         subs w4, w4, #1
>         b.gt 1b
>         ret
> endfunc
>
> I don't know which is faster. For now, the current version follows the
> logic I used in stereo_interpolate[1] (the ipdopd one). It does fewer
> multiplications, but more shuffling.
>
> A 3rd alternative would be possible if we could assume that
> len % 2 == 0 always held (or, basically, if overreading and overwriting
> by one more entry were allowed). Currently, this is not the case.
>
> Speaking of ipdopd, the factors table and the ext may be clumsy.
> ---
[...]
> +function ff_ps_stereo_interpolate_ipdopd_neon, export=1
> +        movrel x5, ipdopd_factors
> +        ld1 {v20.4S}, [x5]
> +        ld1 {v0.4S,v1.4S}, [x2]
> +        ld1 {v6.4S,v7.4S}, [x3]
> +1:
> +        ld1 {v2.2S}, [x0]
> +        ld1 {v3.2S}, [x1]
> +        dup v2.2D, v2.D[0]
> +        dup v3.2D, v3.D[0]
> +        fadd v0.4S, v0.4S, v6.4S
> +        fadd v1.4S, v1.4S, v7.4S
> +        zip1 v16.4S, v0.4S, v0.4S
> +        zip2 v17.4S, v0.4S, v0.4S
> +        zip1 v18.4S, v1.4S, v1.4S
> +        zip2 v19.4S, v1.4S, v1.4S
> +        fmul v4.4S, v2.4S, v16.4S
> +        fmla v4.4S, v3.4S, v17.4S
> +        ext v2.16B, v2.16B, v2.16B, #4
> +        ext v3.16B, v3.16B, v3.16B, #4
> +        fmul v5.4S, v2.4S, v18.4S
> +        fmla v5.4S, v3.4S, v19.4S
> +        fmla v4.4S, v5.4S, v20.4S
You could make ipdopd_factors a sign-bit mask (0, INT32_MIN, 0, INT32_MIN),
then replace the fmla with eor + fadd: the eor flips the sign of the masked
lanes and the fadd accumulates the result.
No idea if that will actually be faster, though.
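
To illustrate the eor + fadd idea, a small scalar sketch in C. It assumes the current table holds +1.0/-1.0 factors in the pattern shown in the comments, which the elided part of the patch would have to confirm. The helper names are made up for the example. INT32_MIN is the bit pattern 0x80000000, i.e. just the IEEE 754 sign bit.

```c
#include <stdint.h>
#include <string.h>

/* XOR the bit pattern of x with mask: 0x80000000 negates x, 0 leaves it as is. */
static float xor_sign(float x, uint32_t mask)
{
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);
    bits ^= mask;
    memcpy(&x, &bits, sizeof x);
    return x;
}

/* Scalar model of "fmla acc, x, f" with f assumed to be e.g. {1, -1, 1, -1}. */
static void fmla4(float acc[4], const float x[4], const float f[4])
{
    for (int i = 0; i < 4; i++)
        acc[i] += x[i] * f[i];
}

/* Scalar model of "eor tmp, x, mask" followed by "fadd acc, acc, tmp",
 * with mask assumed to be e.g. {0, 0x80000000, 0, 0x80000000}. */
static void eor_fadd4(float acc[4], const float x[4], const uint32_t mask[4])
{
    for (int i = 0; i < 4; i++)
        acc[i] += xor_sign(x[i], mask[i]);
}
```

Since multiplying by +1.0 or -1.0 is exact, both forms should give the same result for ordinary inputs; whether the eor + fadd pair beats a single fmla on real hardware still needs benchmarking.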