[FFmpeg-devel] [PATCH 3/3] lavc/aarch64: Add neon implementation for pix_median_abs8

Martin Storsjö martin at martin.st
Sat Sep 17 00:17:07 EEST 2022


On Tue, 13 Sep 2022, Hubert Mazur wrote:

> Provide optimized implementation for pix_median_abs16 function.

The commit message wasn't updated here either; it still says 
pix_median_abs16, while this patch is for pix_median_abs8.

> Performance comparison tests are shown below.
> - median_sad_1_c: 273.7
> - median_sad_1_neon: 98.2
>
> Benchmarks and tests run with checkasm tool on AWS Graviton 3.
>
> Signed-off-by: Hubert Mazur <hum at semihalf.com>
> ---
> libavcodec/aarch64/me_cmp_init_aarch64.c |  3 ++
> libavcodec/aarch64/me_cmp_neon.S         | 65 ++++++++++++++++++++++++
> 2 files changed, 68 insertions(+)

The same comments as for patch 1/3 apply: this looks reasonable, but 
there are a bunch of leftover mov instructions that I don't see the need 
for. Also, please avoid the extra single-lane handling; just do plain 
vector operations and extract the single lane at the end.
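
A minimal sketch of the last point, in portable C rather than NEON 
assembly. It assumes the single-lane handling is the first-column term 
abs(V(0) - V(-stride)) of the C reference, with V(x) = pix1[x] - pix2[x]; 
that is a reading of the comment, not something taken from the patch. All 
function names are hypothetical. The inner loop over 8 lanes stands in for 
one full-width vector instruction, and only lane 0 is read out once after 
the row loop.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper, not from the patch: sums |V(0) - V(-stride)| over
 * rows 1..h-1 by operating on all 8 lanes and keeping only lane 0. */
static int first_column_sum_vectorwise(const uint8_t *pix1, const uint8_t *pix2,
                                       ptrdiff_t stride, int h)
{
    int acc[8] = { 0 };               /* models one 8-lane accumulator register */

    for (int i = 1; i < h; i++) {
        const uint8_t *p1 = pix1 + i * stride;
        const uint8_t *p2 = pix2 + i * stride;
        for (int j = 0; j < 8; j++) { /* models a single full-width vector op */
            int cur   = p1[j] - p2[j];
            int above = p1[j - stride] - p2[j - stride];
            acc[j] += abs(cur - above);
        }
    }
    return acc[0];                    /* the single lane, extracted at the end */
}

/* Scalar reference (also hypothetical) that only ever touches column 0. */
static int first_column_sum_scalar(const uint8_t *pix1, const uint8_t *pix2,
                                   ptrdiff_t stride, int h)
{
    int s = 0;

    for (int i = 1; i < h; i++) {
        int cur   = pix1[i * stride] - pix2[i * stride];
        int above = pix1[(i - 1) * stride] - pix2[(i - 1) * stride];
        s += abs(cur - above);
    }
    return s;
}

/* Both variants must agree; the unused lanes cost nothing extra per row. */
static void check_equivalence(const uint8_t *pix1, const uint8_t *pix2,
                              ptrdiff_t stride, int h)
{
    assert(first_column_sum_vectorwise(pix1, pix2, stride, h) ==
           first_column_sum_scalar(pix1, pix2, stride, h));
}
```

The values in lanes 1..7 are simply discarded here; in a real 
implementation those lanes would carry whatever the other terms need, so 
no separate single-lane instructions are required inside the loop.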

// Martin


