[FFmpeg-devel] [PATCH 2/7] avcodec/aarch64/mpegvideoencdsp: add neon implementations for pix_sum and pix_norm1
Ramiro Polla
ramiro.polla at gmail.com
Sun Aug 18 23:20:07 EEST 2024
On Sun, Aug 18, 2024 at 10:13 PM Ramiro Polla <ramiro.polla at gmail.com> wrote:
>
>                     A53               A76
> pix_norm1_c:      519.2             231.5
> pix_norm1_neon:   195.0 ( 2.66x)     44.2 ( 5.24x)
> pix_sum_c:        344.5             242.2
> pix_sum_neon:     119.0 ( 2.89x)     41.7 ( 5.81x)
This new patchset no longer uses unrolled loops. Even though checkasm
reported the unrolled versions to be faster, Linux perf shows the
non-unrolled versions to be faster in a real encoding use case.
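
For readers unfamiliar with the two functions, here is a minimal scalar sketch of what they compute, assuming (as in the C versions) that both operate on a 16x16 block of 8-bit pixels with a row stride of line_size bytes. The names pix_sum_ref and pix_norm1_ref are made up for this illustration; this is not the code from the patch, and the non-unrolled row loop is only meant to show the shape of the computation.

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative only: hypothetical names, assumed 16x16 block size. */

/* Sum of all 256 pixel values in a 16x16 block. */
static int pix_sum_ref(const uint8_t *pix, ptrdiff_t line_size)
{
    int sum = 0;

    for (int y = 0; y < 16; y++) {      /* one row per iteration, not unrolled */
        for (int x = 0; x < 16; x++)
            sum += pix[x];
        pix += line_size;               /* advance to the next row */
    }
    return sum;
}

/* Sum of the squares of all 256 pixel values in a 16x16 block. */
static int pix_norm1_ref(const uint8_t *pix, ptrdiff_t line_size)
{
    int sum = 0;

    for (int y = 0; y < 16; y++) {
        for (int x = 0; x < 16; x++)
            sum += pix[x] * pix[x];     /* at most 255 * 255 per pixel, fits in int */
        pix += line_size;
    }
    return sum;
}
```

For a contiguous 16x16 buffer, line_size would be 16; for a block inside a larger frame, it would be that frame's stride.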