[Ffmpeg-devel] [REQUEST] MMX/MMX2 and SSE optimizations for H.264 decoding
Martin Boehme
Fri Sep 16 14:20:42 CEST 2005
Loren Merritt wrote:
> On Thu, 15 Sep 2005, Martin Boehme wrote:
>
>> Gamester17 wrote:
>>
>>> Yes, there already are some MMX integer optimizations for H.264, but
>>> what about SSE (Streaming SIMD Extensions) optimizations? Isn't SSE
>>> supposed to be much more powerful than MMX (and in fact be the thing
>>> that replaces MMX)?
>>
>>
>> Well, for a start, SSE has registers that are 128 bits wide, while
>> MMX's registers are 64 bits. As long as you're operating only on the
>> registers (i.e. you're CPU-bound, not memory bandwidth limited) that's
>> an instant factor of 2 speedup.
>
> On AMD, most SSE2 instructions take exactly twice as long as the
> equivalent MMX instruction. Any speedups are due only to scheduling.
> In x264, we have a bunch of SSE2 functions, but most of them are
> _slower_ than the MMX versions on AMD.
Interesting -- I wasn't aware of that. I would assume that the AMD
processors only have enough execution units for 64 bits' worth of data
and have to do SSE operations in two passes?
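
To make the question concrete, here is a minimal sketch of what splitting
a 128-bit operation into two 64-bit steps would mean for the result. It
only shows that one SSE2 add over eight 16-bit lanes gives the same
output as two adds over four lanes each. It says nothing about how any
particular CPU schedules the work internally, and it does not measure
timing. The function names (add8_full, add8_halves, self_check) are made
up for illustration and are not taken from ffmpeg or x264.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <emmintrin.h>  /* SSE2 intrinsics; needs an x86 CPU with SSE2 */

/* One 128-bit operation: add eight 16-bit lanes in a single instruction. */
static void add8_full(const int16_t *a, const int16_t *b, int16_t *out)
{
    __m128i va = _mm_loadu_si128((const __m128i *)a);
    __m128i vb = _mm_loadu_si128((const __m128i *)b);
    _mm_storeu_si128((__m128i *)out, _mm_add_epi16(va, vb));
}

/* The same work done as two 64-bit halves, four lanes per step.
 * Only the low 64 bits of each register carry data here. */
static void add8_halves(const int16_t *a, const int16_t *b, int16_t *out)
{
    for (int half = 0; half < 2; half++) {
        __m128i va = _mm_loadl_epi64((const __m128i *)(a + 4 * half));
        __m128i vb = _mm_loadl_epi64((const __m128i *)(b + 4 * half));
        _mm_storel_epi64((__m128i *)(out + 4 * half),
                         _mm_add_epi16(va, vb));
    }
}

/* Both strategies must produce identical output. */
static void self_check(void)
{
    const int16_t a[8] = {1, 2, 3, 4, 5, 6, 7, 8};
    const int16_t b[8] = {10, 20, 30, 40, 50, 60, 70, 80};
    int16_t full[8], halves[8];

    add8_full(a, b, full);
    add8_halves(a, b, halves);
    assert(memcmp(full, halves, sizeof full) == 0);
}
```

If a processor really executed the first function's add as two 64-bit
internal steps, it would be doing roughly the work of the second
function, which would fit the "twice as long" observation above. That is
my assumption, not something the sketch can confirm.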
> On Intel, yes, SSE2 is faster, but still not a full factor of 2 even
> before you count memory bandwidth.
Thanks for the info!
Martin
--
Martin Böhme
Inst. f. Neuro- and Bioinformatics
Ratzeburger Allee 160, D-23538 Luebeck
Phone: +49 451 500 5514
Fax: +49 451 500 5502
boehme at inb.uni-luebeck.de