[FFmpeg-devel] [aarch64] improve performance of ff_hscale_8_to_15_neon
Jean-Baptiste Kempf
jb at videolan.org
Tue Nov 26 00:18:21 EET 2019
Hello,
On Mon, Nov 25, 2019, at 22:59, Sebastian Pop wrote:
> This patch implements ff_hscale_8_to_15_neon with NEON fused multiply accumulate
> and bumps the vectorization factor from 2 to 4. I have seen speedups up to 15%
> on Graviton A1 instances based on A-72 cpus.
Why add a new version, in intrinsics, instead of changing the existing implementation?
Best,
--
Jean-Baptiste Kempf - President
+33 672 704 734