[FFmpeg-devel] [PATCH] 'vorbis_residue_decode' optimizations
Loren Merritt
lorenm
Tue Sep 9 13:47:28 CEST 2008
On Tue, 9 Sep 2008, Siarhei Siamashka wrote:
> On Wednesday 03 September 2008, Michael Niedermayer wrote:
>>
>> This could be added as a SHOW_CONST_UBITS
>> also gcc should be able to build the mask itself at compile time as long as
>> no asm shift tricks are used.
>
> Sure. The only problem is that it would be nice to use the same macro for both
> constant and non-constant expressions. Adding one more macro does not add much
> convenience, because the compiler can't automatically choose between inserting
> a constant and using the asm shift trick. Or can it?
It can: gcc's __builtin_constant_p.
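A minimal sketch of how a single macro could dispatch on __builtin_constant_p. The names LOW_BITS and low_bits_var are hypothetical and not taken from the patch, and the runtime helper is plain C standing in for wherever an asm shift trick would go. It assumes gcc or a compatible compiler and a bit count in the range 0..31.

```c
#include <stdint.h>

/* Hypothetical runtime path: this is where an asm shift trick could live.
 * Written in plain C here so the sketch stays portable. n must be 0..31. */
static inline uint32_t low_bits_var(uint32_t cache, unsigned n)
{
    return cache & ((UINT32_C(1) << n) - 1);
}

/* Hypothetical macro: when n is a compile-time constant, the mask
 * ((1 << n) - 1) is folded by the compiler; otherwise the runtime
 * helper is called. Both paths return the low n bits of cache. */
#define LOW_BITS(cache, n)                              \
    (__builtin_constant_p(n)                            \
         ? ((uint32_t)(cache) & ((UINT32_C(1) << (n)) - 1)) \
         : low_bits_var((cache), (n)))
```

With this shape, callers write LOW_BITS(cache, 8) and LOW_BITS(cache, bits) the same way, and the compiler picks the branch.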
> Some basic SSE optimizations are added, most likely they still can be
> improved.
You could try decoding the residual in channel-interleaved order, so that
consecutive codebook entries are consecutive in decoded memory. The SIMD
savings might be worth an extra copy to deinterleave afterward.
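A rough sketch of that idea in scalar C, not the decoder's actual code. The function names and buffer layout (sample i of channel c stored at src[i * channels + c]) are assumptions for illustration. Each decoded codebook vector is stored with one contiguous copy, which is the part that would vectorize well, and a separate pass splits the channels afterward.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: store one decoded codebook vector of length dim
 * at offset pos of the interleaved buffer as a single contiguous copy. */
static void put_vector_interleaved(float *interleaved, size_t pos,
                                   const float *vec, size_t dim)
{
    memcpy(interleaved + pos, vec, dim * sizeof(*vec));
}

/* Hypothetical helper: the extra copy. Splits an interleaved buffer
 * holding `samples` samples per channel into per-channel arrays. */
static void deinterleave(float *const *dst, const float *src,
                         size_t channels, size_t samples)
{
    for (size_t i = 0; i < samples; i++)
        for (size_t c = 0; c < channels; c++)
            dst[c][i] = src[i * channels + c];
}
```

Whether the contiguous stores save more than the deinterleave pass costs would have to be measured.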
Better yet, but more complex, would be to decode the residual in
channel-interleaved order and not deinterleave at all. That would reduce the
number of shuffles in mdct/fft (for 2 or 4 channels), but would require new
fft asm.
--Loren Merritt