[MPlayer-dev-eng] [PATCH] vf_eq2 extensions
D Richard Felker III
dalias at aerifal.cx
Fri Jan 31 22:44:43 CET 2003
On Fri, Jan 31, 2003 at 08:11:17PM +0100, Hampa Hug wrote:
> D Richard Felker III wrote:
>
> > On Fri, Jan 31, 2003 at 06:17:20PM +0100, Michael Niedermayer wrote:
> > > I doubt that evaluating a polynomial is faster than a single L1 cache read from a
> > > 256-byte LUT
> >
> > Nope, it's not. But with MMX, evaluating 4 polynomials is just as fast
> > as evaluating one. :) And loading a single byte from memory, then
> > immediately using it as a 32-bit offset into a lookup table, is VERY
> > SLOW. x86 CPUs don't like mixing register sizes these days. I spent a
> > lot of time trying to improve performance on stuff just like this in
> > another program, and I never could get it to work as fast as I wanted.
> >
> > BTW, as a reference, eq2 uses well over twice the CPU time of eq on my
> > system, last I checked.
>
> On my system (UltraSPARC) eq2 is about twice as fast as eq (when
> performing the same task). I doubt that there is a single solution
> that is always faster.
That's because the C code was never optimized, and your UltraSPARC
can't use x86 MMX code. :)
> Could you explain the register size mixing a bit more? Are you saying
> that the 'movzx' instruction is slow?
No, not on recent CPUs anyway. Read the message I just posted to this
thread.
Rich
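
For readers following the LUT-versus-polynomial argument, here is a minimal
sketch in portable C of the two approaches being compared. It is not the
actual vf_eq or vf_eq2 code: the function names, the degree-1 "polynomial"
(contrast in 8.8 fixed point plus a brightness offset), and the parameter
values are all assumptions made for illustration. The four-lane inner loop
only imitates the shape of a 4-wide MMX computation; it does not use MMX.

```c
#include <stddef.h>
#include <stdint.h>

/* Clamp an int to the 0..255 range of an 8-bit sample. */
static uint8_t clip_u8(int v)
{
    return (uint8_t)(v < 0 ? 0 : v > 255 ? 255 : v);
}

/* Hypothetical per-pixel mapping: out = in * contrast / 256 + brightness.
 * contrast is 8.8 fixed point and assumed to be non-negative. */
static uint8_t poly_eval(uint8_t in, int contrast, int brightness)
{
    return clip_u8(((in * contrast) >> 8) + brightness);
}

/* Approach 1: precompute all 256 results once... */
static void build_lut(uint8_t lut[256], int contrast, int brightness)
{
    for (int i = 0; i < 256; i++)
        lut[i] = poly_eval((uint8_t)i, contrast, brightness);
}

/* ...then do one byte load and one table read per pixel. */
static void apply_lut(uint8_t *dst, const uint8_t *src, size_t n,
                      const uint8_t lut[256])
{
    for (size_t i = 0; i < n; i++)
        dst[i] = lut[src[i]];
}

/* Approach 2: evaluate the mapping directly, four pixels per iteration.
 * Each "lane" does the same arithmetic, which is the pattern a SIMD
 * unit can execute in one pass. */
static void apply_poly4(uint8_t *dst, const uint8_t *src, size_t n,
                        int contrast, int brightness)
{
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        int lane[4];
        for (int k = 0; k < 4; k++)
            lane[k] = ((src[i + k] * contrast) >> 8) + brightness;
        for (int k = 0; k < 4; k++)
            dst[i + k] = clip_u8(lane[k]);
    }
    /* Handle the 0..3 leftover pixels one at a time. */
    for (; i < n; i++)
        dst[i] = poly_eval(src[i], contrast, brightness);
}
```

Both functions produce identical output for the same parameters, so which
one is faster is purely a question of the target CPU and compiler, which is
the point of disagreement in the thread.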