[FFmpeg-devel] [PATCH] x86/tx_float: AVX2 SIMD for R2C and C2R RDFTs
Lynne
dev at lynne.ee
Thu Jan 18 20:52:56 EET 2024
Adds full assembly implementations of the R2C and C2R transforms.
R2C Before:
145370 decicycles in av_tx (r2c), 131072 runs, 0 skips
R2C After:
56897 decicycles in av_tx (r2c), 131072 runs, 0 skips
C2R Before:
140958 decicycles in av_tx (c2r), 131071 runs, 1 skips
C2R After:
50427 decicycles in av_tx (c2r), 131061 runs, 11 skips
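For reference, the numbers above work out to roughly a 2.55x speedup for R2C and roughly a 2.8x speedup for C2R. A decicycle is a tenth of a CPU cycle, so 56897 decicycles is about 5690 cycles per call. The helper names below (`speedup`, `decicycles_to_cycles`) are made up for this illustration and are not part of FFmpeg:

```c
#include <assert.h>

/* Hypothetical helper: ratio of two timings given in the same unit.
 * A result above 1.0 means "after" is faster than "before". */
static double speedup(double before, double after)
{
    assert(after > 0.0);
    return before / after;
}

/* Hypothetical helper: the timer output is in decicycles,
 * i.e. tenths of a CPU cycle. */
static double decicycles_to_cycles(double decicycles)
{
    return decicycles / 10.0;
}
```

Calling `speedup(145370, 56897)` gives the R2C ratio and `speedup(140958, 50427)` the C2R ratio. The transform length used for the benchmark is not stated in the mail, so per-sample costs cannot be derived from these figures.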
C2R does an in-place scatter for the FFT.
R2C could be made a little faster by adding an assembly-only
version of the regular lookup-enabled FFT. In theory, that may only
help for really large transforms.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-x86-tx_float-AVX2-SIMD-for-R2C-and-C2R-RDFTs.patch
Type: text/x-diff
Size: 17292 bytes
Desc: not available
URL: <https://ffmpeg.org/pipermail/ffmpeg-devel/attachments/20240118/89840a7c/attachment.patch>