[MPlayer-users] pentium4 bench
Michael Niedermayer
michaelni at gmx.at
Sun Nov 25 03:02:28 CET 2001
Hi
On Sunday 25 November 2001 04:02, Anders Rune Jensen wrote:
> On søn, 2001-11-25 at 01:18, Arpi wrote:
> > Just got access to a pentium4 - 2GHz system with 1GB PC800 RAMBUS ram
> > (fast!) so immediately did some benchmarking on it.
>
> great!
>
[...]
> >
> > note that p4 2ghz doesn't seem to be 2x (or more) faster than 1ghz
> > celeron-2. it's strange, as it has very fast ram (400mhz rambus compared
The P4 needs more cycles for most things than the P3/Celeron.
For example, it needs 4 CPU cycles for a shift where the P3 needs 1, and
14 for a multiply where the P3 needs 4.
The P4 can execute simple instructions (+, -, and, or, ...) faster, but
like the P3/Celeron it can execute at most 3 instructions per CPU cycle.
It has only 8 KB of L1 data cache; the P3/Celeron has 16 KB (and the K7
has 64 KB).
All of this is AFAIK (I don't have a P4 and haven't read much of Intel's P4
optimization manual yet ... I don't like these several-hundred-page manuals).
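
To make the cycle counts above concrete, here is a rough sketch that turns
them into wall-clock time per operation at the two clock rates from this
thread (2 GHz P4, 1 GHz Celeron-2). The numbers are the "afaik" figures
quoted above, not measurements, and the names in the code are made up for
illustration. It only looks at the latency of a single operation and ignores
pipelining, cache, and memory effects.

```python
# Latencies in CPU cycles, as quoted in this mail (unverified, "afaik").
LATENCY_CYCLES = {
    "p4": {"shift": 4, "multiply": 14},
    "p3": {"shift": 1, "multiply": 4},
}

# Clock rates from the thread: 2 GHz Pentium 4, 1 GHz Celeron-2 (P3 core).
CLOCK_GHZ = {"p4": 2.0, "p3": 1.0}


def latency_ns(cpu, op):
    """Time for one operation in nanoseconds: cycles / (cycles per ns)."""
    return LATENCY_CYCLES[cpu][op] / CLOCK_GHZ[cpu]


if __name__ == "__main__":
    for op in ("shift", "multiply"):
        print(op, "p4:", latency_ns("p4", op), "ns",
              "p3:", latency_ns("p3", op), "ns")
```

With these figures a dependent shift or multiply takes longer in absolute
time on the 2 GHz P4 than on the 1 GHz Celeron, which would fit the
observation that the P4 is not 2x faster on this code.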
> > to 133mhz sdram) and also the cpu has double clockrate.
> > maybe the code should be optimized differently for p4?
yes :(
>
> The code should definitely be optimized for Pentium 4. The Pentium 4 uses
> a very special pipeline and is relying heavily on SSE2. I think Intel
SSE2, hmm, I dunno. Intel's manual says that nearly all SSE/SSE2
instructions have a throughput of 2 (which means 1 instruction every 2 CPU
cycles). MMX is 2x as fast, and MMX works on 8 bytes while SSE(2) works on
16, so there won't be much difference IMHO, but I don't have a P4, so it's
just guessing.
At least for postprocessing, SSE2 is not an option because MPEG-1/2/4 uses
8x8 blocks, so MMX is a much better choice.
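
The arithmetic behind that guess can be sketched as follows. It uses only
the throughput figures mentioned above (one MMX instruction per cycle, one
SSE/SSE2 instruction every 2 cycles) and assumes one byte per sample in an
8x8 block; the function names are made up for illustration.

```python
def bytes_per_cycle(register_bytes, cycles_per_instruction):
    """Peak bytes processed per cycle for one instruction stream."""
    return register_bytes / cycles_per_instruction


# Figures from this mail: MMX is 8 bytes wide at 1 instruction per cycle,
# SSE2 is 16 bytes wide at 1 instruction every 2 cycles.
MMX_RATE = bytes_per_cycle(8, 1)
SSE2_RATE = bytes_per_cycle(16, 2)

# Assumption: an 8x8 block stores one byte per sample, so a row is 8 bytes.
BLOCK_ROW_BYTES = 8


def rows_per_register(register_bytes):
    """How many 8-byte block rows fit in one SIMD register."""
    return register_bytes // BLOCK_ROW_BYTES


if __name__ == "__main__":
    print("MMX bytes/cycle:", MMX_RATE)
    print("SSE2 bytes/cycle:", SSE2_RATE)
    print("rows per MMX register:", rows_per_register(8))
    print("rows per SSE2 register:", rows_per_register(16))
```

Both come out at the same peak rate, and one MMX register holds exactly one
block row, while an SSE2 register would have to span two rows (or two
blocks).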
> released some documents describing how to optimize your code for P4.
http://developer.intel.com/design/pentium4/manuals/248966.htm
[...]
Michael