[FFmpeg-devel] [PATCH 2/4] lavc/aarch64: Provide neon implementation of nsse8
Martin Storsjö
martin at martin.st
Wed Sep 28 12:08:34 EEST 2022
On Mon, 26 Sep 2022, Grzegorz Bernacki wrote:
> Add vectorized implementation of nsse8 function.
>
> Performance comparison tests are shown below.
> - nsse_1_c: 256.0
> - nsse_1_neon: 82.7
>
> Benchmarks and tests run with checkasm tool on AWS Graviton 3.
>
> Signed-off-by: Grzegorz Bernacki <gjb at semihalf.com>
> ---
> libavcodec/aarch64/me_cmp_init_aarch64.c | 15 ++++
> libavcodec/aarch64/me_cmp_neon.S | 99 ++++++++++++++++++++++++
> 2 files changed, 114 insertions(+)
Looks reasonable to me, but do check to make sure there are no tabs.
// Martin
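
One way to run the tab check suggested above is a small script like the following. This is only a sketch and not part of the patch or of FFmpeg's tooling: the helper name find_tabs is made up for illustration, and it simply reports every line of a file that contains a tab character.

```python
import sys
from pathlib import Path


def find_tabs(path):
    """Return (line_number, line) pairs for every line that contains a tab.

    Hypothetical helper for illustration; not an FFmpeg tool.
    """
    text = Path(path).read_text(encoding="utf-8", errors="replace")
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        if "\t" in line:
            hits.append((lineno, line))
    return hits


if __name__ == "__main__":
    # Check every file named on the command line; exit non-zero if any tab is found.
    found = False
    for name in sys.argv[1:]:
        for lineno, line in find_tabs(name):
            found = True
            print(f"{name}:{lineno}: tab character found")
    sys.exit(1 if found else 0)
```

Saved under any name, it could be run against the two files touched by the patch, libavcodec/aarch64/me_cmp_init_aarch64.c and libavcodec/aarch64/me_cmp_neon.S, passed as command-line arguments.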