[FFmpeg-devel] [RFC/RFBench] AVX FFT
Vitor Sessak
vitor1001 at gmail.com
Fri Apr 1 19:12:47 CEST 2011
Hi,
The following patches add an AVX (an Intel x86 extension) FFT
implementation. Since I do not have a Sandy Bridge machine myself, I have
no idea of its performance. Benchmarks (e.g., using fft-test -s) are thus
very welcome. Also welcome are suggestions for optimizing it further, in
particular the 8-point FFT (in the T8_AVX macro), which is not much
faster than the SSE version.
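For anyone who wants to check an optimized 8-point kernel before timing
it, a naive scalar reference can help. The following is only a minimal
sketch and not part of the patches: the names cpx and dft8_ref are
hypothetical, and it assumes the forward sign convention
X[k] = sum_j x[j] * exp(-2*pi*i*j*k/8).

```c
#include <stddef.h>

/* Hypothetical complex type for illustration (not FFmpeg's FFTComplex). */
typedef struct { float re, im; } cpx;

/* Naive O(N^2) 8-point DFT, meant for checking results, not for speed.
 * Assumed convention: out[k] = sum_j in[j] * exp(-2*pi*i*j*k/8). */
void dft8_ref(const cpx in[8], cpx out[8])
{
    const float s = 0.70710678f; /* sqrt(1/2) */
    /* tw[m] = exp(-2*pi*i*m/8) = cos(m*45deg) - i*sin(m*45deg) */
    const cpx tw[8] = {
        { 1.0f,  0.0f }, {  s, -s }, { 0.0f, -1.0f }, { -s, -s },
        { -1.0f, 0.0f }, { -s,  s }, { 0.0f,  1.0f }, {  s,  s },
    };

    for (size_t k = 0; k < 8; k++) {
        float re = 0.0f, im = 0.0f;
        for (size_t j = 0; j < 8; j++) {
            /* The twiddle index wraps around modulo 8. */
            const cpx w = tw[(j * k) % 8];
            re += in[j].re * w.re - in[j].im * w.im;
            im += in[j].re * w.im + in[j].im * w.re;
        }
        out[k].re = re;
        out[k].im = im;
    }
}
```

Feeding the same input to this reference and to the kernel under test,
then comparing within a small tolerance, should catch permutation or
sign mistakes.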
One thing noteworthy about AVX is that it uses 256-bit registers, so
now av_malloc needs to align the pointers to 32-byte boundaries. If this
patch is accepted, I'll have to change a bunch of audio decoders to
increase their buffers' alignment (note that, apart from the explicitly
aligned moves such as vmovaps, AVX does not fault if a 256-bit load is
done on a pointer that is only 16-byte aligned, but such a load can
straddle a cache line and thus cause a performance hit).
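To make the alignment requirement concrete, here is a minimal sketch in
plain C11. It is not the av_malloc change from the patches:
alloc_aligned32 and is_aligned are hypothetical helpers, and the sketch
assumes a C library that provides aligned_alloc.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper (not av_malloc): allocate at least `size` bytes
 * on a 32-byte boundary, as 256-bit aligned loads/stores expect.
 * Returns NULL on failure; release the result with free(). */
void *alloc_aligned32(size_t size)
{
    if (size > SIZE_MAX - 31)
        return NULL; /* rounding up would overflow */
    /* C11 aligned_alloc wants the size to be a multiple of the alignment. */
    size_t padded = (size + 31) & ~(size_t)31;
    return aligned_alloc(32, padded);
}

/* Returns 1 if `ptr` is aligned to `align` bytes (a power of two). */
int is_aligned(const void *ptr, size_t align)
{
    return ((uintptr_t)ptr & (align - 1)) == 0;
}
```

A buffer offset by 16 bytes from such an allocation passes
is_aligned(p, 16) but fails is_aligned(p, 32), which is the situation
existing decoders with 16-byte-aligned buffers would be in.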
-Vitor
PS: cross-posted to both lists since I'm interested in feedback from
both groups.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-Change-x86-asm-FFT-permutation-to-later-AVX-FFT-addi.patch
Type: text/x-patch
Size: 1888 bytes
Desc: not available
URL: <http://ffmpeg.org/pipermail/ffmpeg-devel/attachments/20110401/0fd09670/attachment.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0002-Add-AVX-FFT-implementation.patch
Type: text/x-patch
Size: 17178 bytes
Desc: not available
URL: <http://ffmpeg.org/pipermail/ffmpeg-devel/attachments/20110401/0fd09670/attachment-0001.bin>