[FFmpeg-devel] [aarch64] improve performance of ff_hscale_8_to_15_neon

Tue Nov 26 00:18:21 EET 2019

Hello,

On Mon, Nov 25, 2019, at 22:59, Sebastian Pop wrote:
> This patch implements ff_hscale_8_to_15_neon with NEON fused multiply accumulate
> and bumps the vectorization factor from 2 to 4. I have seen speedups up to 15%
> on Graviton A1 instances based on A-72 cpus.

Why add a new version, in intrinsics, instead of changing the existing implementation?

Best,

--
Jean-Baptiste Kempf - President
+33 672 704 734