[FFmpeg-devel] [PATCH] RFC: v210enc optimisations and initial AVX-512

Kieran Kunhya kierank at obe.tv
Fri Oct 21 06:41:17 EEST 2022


Hi,

Please see attached an attempt to optimise the 8-bit input path of v210enc
by reducing the number of shuffles.
This comes at the cost of having to extract the middle element, perform
a DWORD shift on it and then reinsert it.
I have added a few comments, but any other ideas are welcome.
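
For context, here is a minimal scalar sketch in C of the packing that the 8-bit path has to produce. It is not the code from the attached patch and does not reproduce its SIMD approach. The function names are hypothetical, any clipping of sample values is omitted, and reading the "middle element" as the component stored in bits 10..19 of each word is an assumption.

```c
#include <stdint.h>

/* Hypothetical helper: pack three 8-bit samples into one 32-bit v210 word.
 * Each sample is promoted to 10 bits (<< 2) and stored at bit 0, 10 or 20,
 * which gives total shifts of 2, 12 and 22. The top two bits stay zero. */
static uint32_t v210_word_from_8bit(uint8_t lo, uint8_t mid, uint8_t hi)
{
    return ((uint32_t)lo  << 2)  |
           ((uint32_t)mid << 12) |   /* assumed to be the "middle element" */
           ((uint32_t)hi  << 22);
}

/* Hypothetical helper: pack 6 pixels of planar 4:2:2 (6 Y, 3 Cb, 3 Cr)
 * into four v210 words, following the v210 component order. */
static void v210_pack_6px_8bit(const uint8_t *y, const uint8_t *u,
                               const uint8_t *v, uint32_t dst[4])
{
    dst[0] = v210_word_from_8bit(u[0], y[0], v[0]);
    dst[1] = v210_word_from_8bit(y[1], u[1], y[2]);
    dst[2] = v210_word_from_8bit(v[1], y[3], u[2]);
    dst[3] = v210_word_from_8bit(y[4], v[2], y[5]);
}
```

In a vectorised version the three components of a word need different shift amounts, which is where shuffles and per-element shifts come in.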

Crude benchmarks on Intel(R) Xeon(R) D-2123IT:

Before:

v210_planar_pack_8_ssse3: 316.5
v210_planar_pack_8_avx: 319.0
v210_planar_pack_8_avx2: 223.0

After:

v210_planar_pack_8_ssse3: 321.0
v210_planar_pack_8_avx: 326.0
v210_planar_pack_8_avx2: 217.0
v210_planar_pack_8_avx512: 211.0

Regards,
Kieran Kunhya
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-RFC-v210enc-optimisations-and-initial-AVX-512.patch
Type: application/octet-stream
Size: 4642 bytes
Desc: not available
URL: <https://ffmpeg.org/pipermail/ffmpeg-devel/attachments/20221021/e8ed64a2/attachment.obj>