[MPlayer-dev-eng] [PATCH] (new version) AltiVec: dct64 for mp3lib, IMDCT for liba52, detection code
Romain Dolbeau
dolbeau at irisa.fr
Sat Jan 18 18:07:38 CET 2003
Daniel Egger wrote:
> This is strange because a Radeon supports both of them.
The Radeon surely does. Apple's driver, and in particular
the YUV overlay part... apparently does not :-(
> This is completely upside down; I wouldn't remotely rely on a profile
> dump where decoding the whole movie takes less time than copying chunks
> of memory.
Yet most expected functions can be seen in the profile,
so they are profiled.
Also, the machine where the profile was generated was
an 800 MHz PPC7450 w/o L3 cache, so as soon as you're
out of the 128KB L2, memory accesses are very expensive.
(it's regular PC133, not even DDR on this box...)
If the heavy computations are in-cache and the memory
copies are off-cache, then the copies will eat up
plenty of time. If you (copy, compute_on_copy),
then most of the memory latency will be seen in the
copy instead of the computations.
I don't have a G4 w/ L3 to verify this theory :-(
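
A minimal sketch of the (copy, compute_on_copy) pattern, for anyone who wants
to test the theory on their own box. Nothing here is MPlayer code: the names
pass_timing, sum_bytes and time_copy_then_compute are hypothetical, and
clock() is only a coarse timer. The assumption is that with a buffer much
larger than the last cache level, the memcpy pays for the trip to main
memory while the pass over the copy mostly sees whatever is still cached.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Hypothetical result record: time spent in each phase plus a checksum
 * so the compiler cannot discard the compute pass. */
struct pass_timing {
    double copy_seconds;
    double compute_seconds;
    uint32_t checksum;
};

/* Stand-in for the "heavy computation": a plain byte sum over the buffer. */
static uint32_t sum_bytes(const unsigned char *buf, size_t n)
{
    uint32_t s = 0;
    for (size_t i = 0; i < n; i++)
        s += buf[i];
    return s;
}

/* Copy n bytes, then compute on the copy, timing each phase separately.
 * Returns 0 on success, -1 if allocation fails. */
static int time_copy_then_compute(size_t n, struct pass_timing *out)
{
    assert(out != NULL);

    unsigned char *src = malloc(n);
    unsigned char *dst = malloc(n);
    if (src == NULL || dst == NULL) {
        free(src);
        free(dst);
        return -1;
    }

    /* Fill the source with a known pattern (this also touches every page). */
    for (size_t i = 0; i < n; i++)
        src[i] = (unsigned char)(i & 0xff);

    clock_t t0 = clock();
    memcpy(dst, src, n);                 /* phase 1: the copy */
    clock_t t1 = clock();
    out->checksum = sum_bytes(dst, n);   /* phase 2: compute on the copy */
    clock_t t2 = clock();

    out->copy_seconds = (double)(t1 - t0) / CLOCKS_PER_SEC;
    out->compute_seconds = (double)(t2 - t1) / CLOCKS_PER_SEC;

    free(src);
    free(dst);
    return 0;
}
```

Calling time_copy_then_compute() once with a size that fits in L2 and once
with a size many times larger, then comparing copy_seconds against
compute_seconds, gives a rough idea of how much of the cost lands in the copy.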
> You need to compile the whole application with profiling AND link it
> against a proper libc.
I added -pg to config.mak, isn't that enough? (after removing
all .a and .o files, of course). I'm not sure what a proper libc
for profiling on Mac OS X is...
--
Romain Dolbeau