[FFmpeg-devel] [PATCH] aarch64/h26x: Add put_hevc_pel_bi_w_pixels
Martin Storsjö
martin at martin.st
Fri Apr 25 11:29:05 EEST 2025
On Wed, 23 Apr 2025, Zhao Zhili wrote:
> From: Zhao Zhili <zhilizhao at tencent.com>
>
> On rpi5 (A76):
>
> put_hevc_pel_bi_w_pixels4_8_c: 90.0 ( 1.00x)
> put_hevc_pel_bi_w_pixels4_8_neon: 34.1 ( 2.64x)
> put_hevc_pel_bi_w_pixels6_8_c: 188.3 ( 1.00x)
> put_hevc_pel_bi_w_pixels6_8_neon: 73.5 ( 2.56x)
> put_hevc_pel_bi_w_pixels8_8_c: 327.1 ( 1.00x)
> put_hevc_pel_bi_w_pixels8_8_neon: 75.8 ( 4.32x)
> put_hevc_pel_bi_w_pixels12_8_c: 728.8 ( 1.00x)
> put_hevc_pel_bi_w_pixels12_8_neon: 186.1 ( 3.92x)
> put_hevc_pel_bi_w_pixels16_8_c: 1288.1 ( 1.00x)
> put_hevc_pel_bi_w_pixels16_8_neon: 268.5 ( 4.80x)
> put_hevc_pel_bi_w_pixels24_8_c: 2855.5 ( 1.00x)
> put_hevc_pel_bi_w_pixels24_8_neon: 723.8 ( 3.95x)
> put_hevc_pel_bi_w_pixels32_8_c: 5095.3 ( 1.00x)
> put_hevc_pel_bi_w_pixels32_8_neon: 1165.0 ( 4.37x)
> put_hevc_pel_bi_w_pixels48_8_c: 11521.5 ( 1.00x)
> put_hevc_pel_bi_w_pixels48_8_neon: 2856.0 ( 4.03x)
> put_hevc_pel_bi_w_pixels64_8_c: 21020.5 ( 1.00x)
> put_hevc_pel_bi_w_pixels64_8_neon: 4699.1 ( 4.47x)
> ---
> libavcodec/aarch64/h26x/dsp.h | 5 +
> libavcodec/aarch64/h26x/epel_neon.S | 373 ++++++++++++++++++++++
> libavcodec/aarch64/hevcdsp_init_aarch64.c | 13 +
> 3 files changed, 391 insertions(+)
This looks good overall, thanks!
It's quite regrettable how many duplicates of near-identical functions
there are in the h26x qpel/epel code; ideally we should be able to
produce most of these function variants with some sort of template instead
of having them all duplicated (with minor style differences).
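As a rough illustration of the template idea, here is a minimal sketch in plain C rather than NEON assembly. The function names, parameter list and simplified rounding are made up for illustration and do not match the actual FFmpeg signatures or the HEVC weighted prediction formula; the point is only that one body can generate every width variant:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: drop negative sums to 0, shift, then clamp to 8 bits. */
static inline uint8_t shift_clip_u8(int v, int shift)
{
    if (v < 0)
        return 0;
    v >>= shift;
    return v > 255 ? 255 : (uint8_t)v;
}

/*
 * One template body, instantiated once per block width.
 * Assumption: src2 is a packed intermediate buffer with a stride of W
 * elements; the real code uses a different layout and scaling.
 */
#define DEFINE_BI_W_PIXELS(W)                                            \
static void bi_w_pixels##W(uint8_t *dst, ptrdiff_t dst_stride,           \
                           const uint8_t *src, ptrdiff_t src_stride,     \
                           const int16_t *src2, int height,              \
                           int w0, int w1, int offset, int shift)        \
{                                                                        \
    for (int y = 0; y < height; y++) {                                   \
        for (int x = 0; x < (W); x++) {                                  \
            int v = src[x] * w1 + src2[x] * w0 + offset;                 \
            dst[x] = shift_clip_u8(v, shift);                            \
        }                                                                \
        dst  += dst_stride;                                              \
        src  += src_stride;                                              \
        src2 += (W);                                                     \
    }                                                                    \
}

/* Each line expands to a complete width-specialized function. */
DEFINE_BI_W_PIXELS(4)
DEFINE_BI_W_PIXELS(8)
DEFINE_BI_W_PIXELS(16)
```

In the assembly the equivalent would presumably be a gas .macro parameterized on the width and on the load/store and weighting steps, so that the pel/qpel/epel variants differ only in the arguments passed to it.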
// Martin