[FFmpeg-devel] [PATCH] avcodec/x86/vp9: Add AVX-512ICL for 16x16 and 32x32 8bpc inverse transforms

Henrik Gramner henrik at gramner.com
Sat May 17 01:59:31 EEST 2025


Placed in a new file, as the existing combined MMX/SSE/AVX file is
humongous and already takes forever to assemble.

This adds ~16 KiB of .text. The existing 8bpc asm is ~240 KiB, of which
the corresponding AVX2 functions make up ~42 KiB.

Tested to pass FATE on Linux and Windows.

Checkasm numbers vs AVX2 on Zen 5 (Strix Halo):
  vp9_inv_adst_adst_16x16_sub16_add_8_avx2:        209.3
  vp9_inv_adst_adst_16x16_sub16_add_8_avx512icl:    99.5

  vp9_inv_adst_dct_16x16_sub16_add_8_avx2:         165.2
  vp9_inv_adst_dct_16x16_sub16_add_8_avx512icl:     89.7

  vp9_inv_dct_adst_16x16_sub16_add_8_avx2:         165.9
  vp9_inv_dct_adst_16x16_sub16_add_8_avx512icl:     87.7

  vp9_inv_dct_dct_16x16_sub16_add_8_avx2:          121.3
  vp9_inv_dct_dct_16x16_sub16_add_8_avx512icl:      79.2

  vp9_inv_dct_dct_32x32_sub32_add_8_avx2:          745.5
  vp9_inv_dct_dct_32x32_sub32_add_8_avx512icl:     285.5
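
For readers who want the relative gains rather than raw timings, the
ratios can be derived from the numbers above. This is only a sketch: it
assumes lower checkasm numbers mean faster code, and the short dictionary
keys and the speedups() helper are illustrative names, not part of the
patch or of checkasm.

```python
# Checkasm timings copied from the table above: (AVX2, AVX-512ICL).
# The unit is whatever checkasm reported; only the ratio is used here.
TIMINGS = {
    "adst_adst_16x16": (209.3, 99.5),
    "adst_dct_16x16": (165.2, 89.7),
    "dct_adst_16x16": (165.9, 87.7),
    "dct_dct_16x16": (121.3, 79.2),
    "dct_dct_32x32": (745.5, 285.5),
}


def speedups(timings):
    """Return the AVX2 / AVX-512ICL ratio per function.

    Assumes a lower timing is faster, so a ratio above 1.0 means the
    AVX-512ICL version is the quicker one.
    """
    return {name: avx2 / avx512 for name, (avx2, avx512) in timings.items()}


if __name__ == "__main__":
    for name, ratio in speedups(TIMINGS).items():
        print(f"{name:16s} {ratio:.2f}x")
```

Under that assumption the ratios run from roughly 1.53x for the 16x16
DCT/DCT case to roughly 2.61x for the 32x32 DCT/DCT case.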
-------------- next part --------------
A non-text attachment was scrubbed...
Name: vp9_itx_avx512.patch
Type: application/octet-stream
Size: 73066 bytes
Desc: not available
URL: <https://ffmpeg.org/pipermail/ffmpeg-devel/attachments/20250517/801c04e8/attachment.obj>

