[FFmpeg-devel] [PATCH] RFC: v210enc optimisations and initial AVX-512
Kieran Kunhya
kierank at obe.tv
Fri Oct 21 06:41:17 EEST 2022
Hi,
Please find attached an attempt to optimise the 8-bit input path of v210enc
by reducing the number of shuffles.
This comes at the cost of having to extract the middle element, perform
a DWORD shift on it, and then reinsert it.
I have added a few comments but any other ideas are welcome.
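For readers unfamiliar with the format, here is a minimal scalar sketch of what
packing 8-bit input into v210 involves. It is not the code from the attached
patch and not the SIMD implementation. The helper names v210_word8 and
v210_pack6_8 are made up for illustration, and any clamping of sample values
that a real encoder may apply is omitted. Each 32-bit v210 word holds three
10-bit components at bits 0-9, 10-19 and 20-29, so an 8-bit sample is scaled by
a left shift of 2 and then placed with a total shift of 2, 12 or 22. The middle
component is the one sitting at the odd 12-bit offset.

```c
#include <stdint.h>

/* Hypothetical helper: pack three 8-bit samples into one 32-bit v210 word.
 * Each sample is widened to 10 bits (<< 2) and placed at bit 0, 10 or 20,
 * giving total shifts of 2, 12 and 22. The top two bits stay zero. */
static uint32_t v210_word8(uint8_t a, uint8_t b, uint8_t c)
{
    return ((uint32_t)a << 2) | ((uint32_t)b << 12) | ((uint32_t)c << 22);
}

/* Hypothetical helper: pack 6 pixels of planar 4:2:2 (6 Y, 3 U, 3 V samples)
 * into 4 v210 words. Words are returned in host order; on disk v210 words
 * are little-endian, so a big-endian host would need a byte swap on store. */
static void v210_pack6_8(const uint8_t *y, const uint8_t *u, const uint8_t *v,
                         uint32_t dst[4])
{
    dst[0] = v210_word8(u[0], y[0], v[0]); /* Cb0 Y0 Cr0 */
    dst[1] = v210_word8(y[1], u[1], y[2]); /* Y1  Cb1 Y2 */
    dst[2] = v210_word8(v[1], y[3], u[2]); /* Cr1 Y3  Cb2 */
    dst[3] = v210_word8(y[4], v[2], y[5]); /* Y4  Cr2 Y5 */
}
```

In a SIMD version the three shifts have to be applied to different lanes of the
same register, which is where shuffles and per-element handling come in.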
Crude benchmarks on Intel(R) Xeon(R) D-2123IT:
Before:
v210_planar_pack_8_ssse3: 316.5
v210_planar_pack_8_avx: 319.0
v210_planar_pack_8_avx2: 223.0
After:
v210_planar_pack_8_ssse3: 321.0
v210_planar_pack_8_avx: 326.0
v210_planar_pack_8_avx2: 217.0
v210_planar_pack_8_avx512: 211.0
Regards,
Kieran Kunhya
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-RFC-v210enc-optimisations-and-initial-AVX-512.patch
Type: application/octet-stream
Size: 4642 bytes
Desc: not available
URL: <https://ffmpeg.org/pipermail/ffmpeg-devel/attachments/20221021/e8ed64a2/attachment.obj>