[FFmpeg-devel] [PATCH] avcodec/x86/vp9: Add AVX-512ICL for 16x16 and 32x32 8bpc inverse transforms
Henrik Gramner
henrik at gramner.com
Sat May 17 01:59:31 EEST 2025
Placed in a new separate file as the existing combined MMX/SSE/AVX
file is humongous and takes forever to assemble as is.
This adds ~16 KiB of .text. The existing 8bpc asm is ~240 KiB, of which
the corresponding AVX2 functions make up ~42 KiB.
Tested to pass FATE on Linux and Windows.
Checkasm numbers vs AVX2 on Zen 5 (Strix Halo):
vp9_inv_adst_adst_16x16_sub16_add_8_avx2:      209.3
vp9_inv_adst_adst_16x16_sub16_add_8_avx512icl:  99.5
vp9_inv_adst_dct_16x16_sub16_add_8_avx2:       165.2
vp9_inv_adst_dct_16x16_sub16_add_8_avx512icl:   89.7
vp9_inv_dct_adst_16x16_sub16_add_8_avx2:       165.9
vp9_inv_dct_adst_16x16_sub16_add_8_avx512icl:   87.7
vp9_inv_dct_dct_16x16_sub16_add_8_avx2:        121.3
vp9_inv_dct_dct_16x16_sub16_add_8_avx512icl:    79.2
vp9_inv_dct_dct_32x32_sub32_add_8_avx2:        745.5
vp9_inv_dct_dct_32x32_sub32_add_8_avx512icl:   285.5
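
For readers who want the relative gains rather than the raw timings, the
following is a small sketch that divides each AVX2 number by its AVX-512ICL
counterpart. The timings are copied from the list above (lower is better);
the shortened transform names and the speedups() helper are made up for
this illustration and are not part of the patch or of checkasm.

```python
# Checkasm timings from the post, as (avx2, avx512icl) pairs; lower is better.
# The short keys are illustrative labels, not identifiers from the patch.
TIMINGS = {
    "adst_adst_16x16": (209.3, 99.5),
    "adst_dct_16x16": (165.2, 89.7),
    "dct_adst_16x16": (165.9, 87.7),
    "dct_dct_16x16": (121.3, 79.2),
    "dct_dct_32x32": (745.5, 285.5),
}


def speedups(timings):
    """Return {name: avx2_time / avx512icl_time} for each transform."""
    return {name: avx2 / avx512 for name, (avx2, avx512) in timings.items()}


if __name__ == "__main__":
    for name, ratio in speedups(TIMINGS).items():
        print(f"{name}: {ratio:.2f}x")
```

On these numbers the ratios range from roughly 1.5x (16x16 DCT/DCT) to
roughly 2.6x (32x32 DCT/DCT).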
-------------- next part --------------
A non-text attachment was scrubbed...
Name: vp9_itx_avx512.patch
Type: application/octet-stream
Size: 73066 bytes
Desc: not available
URL: <https://ffmpeg.org/pipermail/ffmpeg-devel/attachments/20250517/801c04e8/attachment.obj>