[FFmpeg-devel] [PATCH] x86/tx_float: AVX2 SIMD for R2C and C2R RDFTs

Lynne dev at lynne.ee
Thu Jan 18 20:52:56 EET 2024


Adds full AVX2 assembly implementations of the R2C and C2R RDFT transforms.

R2C Before:
145370 decicycles in           av_tx (r2c),  131072 runs,      0 skips
R2C After:
56897 decicycles in           av_tx (r2c),  131072 runs,      0 skips

C2R Before:
140958 decicycles in           av_tx (c2r),  131071 runs,      1 skips
C2R After:
50427 decicycles in           av_tx (c2r),  131061 runs,     11 skips
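
For readers who want the ratios rather than the raw counts, the figures above can be turned into speedups with a few lines of arithmetic. This is only a convenience sketch: the numbers are copied from the benchmark lines above, and the `timings` dictionary and `speedup` helper are names made up for this illustration, not part of FFmpeg or the patch.

```python
# Timings copied from the benchmark lines above (decicycles per av_tx call).
# The dictionary layout and helper name are illustrative only.
timings = {
    "r2c": (145370, 56897),  # (before, after)
    "c2r": (140958, 50427),
}


def speedup(before, after):
    """Return how many times faster the 'after' timing is than 'before'."""
    return before / after


for name, (before, after) in timings.items():
    reduction = 100 * (1 - after / before)
    print(f"{name}: {speedup(before, after):.2f}x faster, "
          f"{reduction:.1f}% fewer decicycles")
```

By these figures, R2C comes out roughly 2.5x faster and C2R roughly 2.8x faster.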

C2R does an in-place scatter for the FFT.
R2C could be made a little faster by adding an assembly-only
version of the regular lookup-enabled FFT. In theory, this may only
help for really large transforms.
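
As background on why an FFT shows up inside R2C and C2R at all: a common textbook way to compute a real-to-complex DFT of N samples is to run a complex FFT of length N/2 on the packed input and then apply a post-processing pass. The sketch below illustrates that generic technique only. It is not the code in the patch, which is not shown in this message, and it does not cover the in-place scatter mentioned above. The function names are made up for the illustration, and a naive O(N^2) DFT stands in for the fast FFT.

```python
import cmath


def dft(x):
    """Naive O(N^2) complex DFT; stands in for a real FFT in this sketch."""
    n = len(x)
    return [sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
            for k in range(n)]


def r2c(x):
    """Real-to-complex DFT of an even-length real input, giving N/2 + 1 bins."""
    n = len(x)
    assert n % 2 == 0, "this sketch assumes an even length"
    h = n // 2
    # Pack even samples into the real part, odd samples into the imaginary part.
    z = [complex(x[2 * j], x[2 * j + 1]) for j in range(h)]
    Z = dft(z)  # half-length complex transform
    out = []
    for k in range(h + 1):
        a = Z[k % h]
        b = Z[(h - k) % h].conjugate()
        even = (a + b) / 2    # DFT of the even-indexed samples
        odd = (a - b) / 2j    # DFT of the odd-indexed samples
        out.append(even + cmath.exp(-2j * cmath.pi * k / n) * odd)
    return out
```

The result can be checked against the first N/2 + 1 bins of a full-length complex DFT of the same input, for example `dft([complex(v) for v in x])[:len(x) // 2 + 1]`.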

-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-x86-tx_float-AVX2-SIMD-for-R2C-and-C2R-RDFTs.patch
Type: text/x-diff
Size: 17292 bytes
Desc: not available
URL: <https://ffmpeg.org/pipermail/ffmpeg-devel/attachments/20240118/89840a7c/attachment.patch>

