[MPlayer-users] pentium4 bench

Michael Niedermayer michaelni at gmx.at
Sun Nov 25 03:02:28 CET 2001


Hi

On Sunday 25 November 2001 04:02, Anders Rune Jensen wrote:
> On søn, 2001-11-25 at 01:18, Arpi wrote:
> > Just got access to a pentium4 - 2GHz system with 1GB PC800 RAMBUS ram
> > (fast!) so immediately did some benchmarking on it.
>
> great!
>
[...]
> >
> > note that p4 2ghz doesn't seem to be 2x (or more) faster than 1ghz
> > celeron-2. it's strange, as it has very fast ram (400mhz rambus compared
The P4 needs more time than the P3/Celeron for most things.
For example, it needs 4 CPU cycles for a shift where the P3 needs 1, and 14
for a multiply where the P3 needs 4.
The P4 can execute simple instructions (+, -, and, or, ...) faster, but like
the P3/Celeron it can execute at most 3 instructions per CPU cycle.
It has only 8 KB of L1 data cache; the P3/Celeron has 16 KB (and the K7 has
64 KB).
All of this is AFAIK (I don't have a P4 and haven't read much of Intel's P4
optimization manual yet ... I don't like these several-hundred-page manuals).
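
To put rough numbers on that: a minimal sketch in C that converts the cycle
counts quoted above into nanoseconds for a 2 GHz P4 and a 1 GHz P3/Celeron.
The cycle counts are only the AFAIK figures from this mail, not measurements,
and the function names latency_ns and print_comparison are made up for
illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Wall-clock time in nanoseconds for an operation that takes `cycles`
 * CPU cycles at `clock_ghz` GHz (1 GHz means 1 cycle per nanosecond). */
double latency_ns(int cycles, double clock_ghz)
{
    assert(cycles >= 0 && clock_ghz > 0.0);
    return cycles / clock_ghz;
}

/* Print one comparison line for a 2 GHz P4 and a 1 GHz P3/Celeron,
 * using the (unverified) cycle counts quoted in this mail. */
void print_comparison(const char *op, int p4_cycles, int p3_cycles)
{
    printf("%-8s  p4 @ 2GHz: %.1f ns   p3/cel @ 1GHz: %.1f ns\n",
           op, latency_ns(p4_cycles, 2.0), latency_ns(p3_cycles, 1.0));
}
```

Calling print_comparison("shift", 4, 1) and print_comparison("multiply", 14, 4)
gives 2.0 ns vs 1.0 ns and 7.0 ns vs 4.0 ns. If those cycle counts hold, the
doubled clock does not make up for the extra cycles on these instructions.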

> > to 133mhz sdram) and also the cpu has double clockrate.
> > maybe the code should be optimized differently for p4?
yes :(

>
> The code should definitely be optimized for Pentium 4. The Pentium 4 uses
> a very special pipeline and is relying heavily on SSE2. I think Intel
SSE2, hmm, I dunno. Intel's manual says that nearly all SSE/SSE2 instructions
have a throughput of 2 (which means 1 instruction every 2 CPU cycles). MMX is
2x as fast, and MMX works on 8 bytes while SSE(2) works on 16, so there won't
be much difference IMHO, but I don't have a P4, so it's just guessing.
At least for postprocessing SSE2 is not an option, because MPEG-1/2/4 use 8x8
blocks, so MMX is a much better choice.
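
The arithmetic behind that guess, as a small C sketch. It assumes the
throughput figures quoted above (MMX: 1 instruction per cycle, SSE/SSE2: 1
every 2 cycles) and 8-bit samples, so one row of an 8x8 block is 8 bytes. The
function names bytes_per_cycle and row_fill are made up for illustration.

```c
#include <assert.h>

/* Peak bytes processed per CPU cycle by a SIMD instruction that works on
 * `bytes_per_insn` bytes and can be issued once every `cycles_per_insn`
 * cycles (the "throughput" figure from the manual). */
double bytes_per_cycle(int bytes_per_insn, int cycles_per_insn)
{
    assert(bytes_per_insn > 0 && cycles_per_insn > 0);
    return (double)bytes_per_insn / cycles_per_insn;
}

/* Fraction of a `reg_bytes`-wide register that one block row of
 * `row_bytes` bytes fills (assumes the row is not wider than the register). */
double row_fill(int row_bytes, int reg_bytes)
{
    assert(row_bytes > 0 && row_bytes <= reg_bytes);
    return (double)row_bytes / reg_bytes;
}
```

With these figures, bytes_per_cycle(8, 1) for MMX and bytes_per_cycle(16, 2)
for SSE2 both come out at 8 bytes per cycle, so neither wins on peak
throughput. row_fill(8, 8) is 1.0 for an MMX register, while row_fill(8, 16)
is 0.5 for an SSE2 register: one 8-byte row fills only half of it.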

> released some documents describing how to optimize your code for P4.
http://developer.intel.com/design/pentium4/manuals/248966.htm

[...]

Michael


