[MPlayer-dev-eng] [PATCH] vf_eq2 extensions
Hampa Hug
hampa at hampa.ch
Fri Jan 31 20:11:17 CET 2003
D Richard Felker III wrote:
> On Fri, Jan 31, 2003 at 06:17:20PM +0100, Michael Niedermayer wrote:
> > I doubt that evaluating a polynomial is faster than a single L1 cache read from a
> > 256 byte LUT
>
> Nope, it's not. But with MMX, evaluating 4 polynomials is just as fast
> as evaluating one. :) And loading a single byte from memory, then
> immediately using it as a 32-bit offset into a lookup table, is VERY
> SLOW. x86 CPUs don't like mixing register sizes these days. I spent a
> lot of time trying to improve performance on stuff just like this in
> another program, and I never could get it to work as fast as I wanted.
>
> BTW, as a reference, eq2 uses well over twice the cpu time of eq on my
> system, last I checked.
On my system (UltraSPARC) eq2 is about twice as fast as eq (when
performing the same task). I doubt that there is a single solution
that is always faster on every architecture.
Could you explain the register size mixing a bit more? Are you saying
that the 'movzx' instruction is slow?
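
To make the comparison concrete, here is a minimal C sketch of the
table-lookup approach under discussion. It is not the actual vf_eq2
code: build_lut, apply_lut and the contrast/brightness formula are
hypothetical names and assumptions chosen for illustration. The
expression lut[src[i]] is the byte load that is then used as a table
offset; on x86 a compiler would typically emit a zero-extending load
(movzx) for it.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: fill a 256-entry table with a simple
 * contrast/brightness mapping around mid-gray, clamped to 0..255.
 * The formula is an assumption, not the one used by eq or eq2. */
static void build_lut(uint8_t lut[256], double contrast, double brightness)
{
    for (int i = 0; i < 256; i++) {
        double v = contrast * (i - 128) + 128.0 + brightness;
        if (v < 0.0)
            v = 0.0;
        if (v > 255.0)
            v = 255.0;
        lut[i] = (uint8_t)(v + 0.5); /* round to nearest */
    }
}

/* One byte load per pixel, used directly as the table index.
 * This is the access pattern whose cost is being debated. */
static void apply_lut(uint8_t *dst, const uint8_t *src, size_t n,
                      const uint8_t lut[256])
{
    for (size_t i = 0; i < n; i++)
        dst[i] = lut[src[i]];
}
```

The table is built once per parameter change, so the per-pixel cost is
just the load and the indexed read. Whether that beats evaluating a
polynomial on several pixels at once depends on the CPU.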
Hampa