[FFmpeg-devel] [PATCH 2/2 v2] x86/takdsp: add avx2 versions of all functions

Lynne dev at lynne.ee
Sat Dec 23 12:44:38 EET 2023


Dec 23, 2023, 00:53 by jamrial at gmail.com:

> On an Intel Core i7 12700k:
>
> decorrelate_ls_c: 814.3
> decorrelate_ls_sse2: 165.8
> decorrelate_ls_avx2: 101.3
> decorrelate_sf_c: 1602.6
> decorrelate_sf_sse4: 640.1
> decorrelate_sf_avx2: 324.6
> decorrelate_sm_c: 1564.8
> decorrelate_sm_sse2: 379.3
> decorrelate_sm_avx2: 203.3
> decorrelate_sr_c: 785.3
> decorrelate_sr_sse2: 176.3
> decorrelate_sr_avx2: 99.8
>
> Signed-off-by: James Almer <jamrial at gmail.com>
>

Even better on a Zen3:
checkasm: all 8 tests passed
decorrelate_ls_c: 111.1
decorrelate_ls_sse2: 272.6
decorrelate_ls_avx2: 94.1
decorrelate_sf_c: 170.6
decorrelate_sf_sse4: 400.1
decorrelate_sf_avx2: 196.1
decorrelate_sm_c: 187.6
decorrelate_sm_sse2: 383.1
decorrelate_sm_avx2: 179.1
decorrelate_sr_c: 102.6
decorrelate_sr_sse2: 272.6
decorrelate_sr_avx2: 94.1

Tested; decoding works fine too. LGTM.
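
For anyone comparing the two runs, here is a rough sketch that turns the checkasm figures quoted above into speedup ratios relative to the C version (lower figures are faster, so a ratio below 1 means the variant measured slower than C in that run). The dictionary layout and the `speedups` helper are my own illustration, not part of the patch or of checkasm.

```python
# Benchmark figures copied verbatim from the two checkasm runs quoted above.
# Lower is faster. The table layout and helper name are illustrative only.
I7_12700K = {
    "ls": {"c": 814.3, "sse2": 165.8, "avx2": 101.3},
    "sf": {"c": 1602.6, "sse4": 640.1, "avx2": 324.6},
    "sm": {"c": 1564.8, "sse2": 379.3, "avx2": 203.3},
    "sr": {"c": 785.3, "sse2": 176.3, "avx2": 99.8},
}
ZEN3 = {
    "ls": {"c": 111.1, "sse2": 272.6, "avx2": 94.1},
    "sf": {"c": 170.6, "sse4": 400.1, "avx2": 196.1},
    "sm": {"c": 187.6, "sse2": 383.1, "avx2": 179.1},
    "sr": {"c": 102.6, "sse2": 272.6, "avx2": 94.1},
}


def speedups(results):
    """Return {function: {variant: c_figure / variant_figure}} for non-C variants."""
    return {
        fn: {
            variant: figures["c"] / value
            for variant, value in figures.items()
            if variant != "c"
        }
        for fn, figures in results.items()
    }


if __name__ == "__main__":
    for name, table in (("i7 12700k", I7_12700K), ("Zen3", ZEN3)):
        print(name)
        for fn, ratios in speedups(table).items():
            line = ", ".join(f"{v}: {r:.2f}x" for v, r in ratios.items())
            print(f"  decorrelate_{fn}: {line}")
```

The ratios only compare variants within a single run; the two machines' absolute figures are not directly comparable.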

