[MPlayer-dev-eng] [PATCH] (new version) AltiVec: dct64 for mp3lib, IMDCT for liba52, detection code

Daniel Egger degger at fhm.edu
Sat Jan 18 17:09:31 CET 2003


On Sat, 2003-01-18 at 14:34, Romain Dolbeau wrote:

> I don't doubt libmpeg2, I was talking about my rip-off
> of the code included in my mplayer patch. I won't claim
> it works until I've tested it :-)

Sure, but the approach is cool. :)

> What's MC? I know absolutely nothing about image/sound/signal
> processing, I'm just a comp.arch guy trying to spare his
> precious CPU cycles :-)

Motion compensation, basically the code that reassembles moved
blocks into the final picture.
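
To make that concrete, here is a minimal sketch of the simplest case: copying
one 16x16 block from the previous (reference) frame, displaced by a whole-pixel
motion vector. The function name and parameters are made up for illustration;
this is not the libavcodec or libmpeg2 code. It assumes the displaced block
lies entirely inside the reference frame. Real decoders also interpolate for
half-pixel vectors and add the IDCT residual on top.

```c
#include <stdint.h>

/* Hypothetical helper, for illustration only.
 * Copies the 16x16 block at (bx, by) in dst from the position
 * (bx + mvx, by + mvy) in the reference frame ref.
 * Both frames share the same stride (bytes per row).
 * Assumes the displaced block stays inside the reference frame. */
static void mc_copy_block16(uint8_t *dst, const uint8_t *ref, int stride,
                            int bx, int by, int mvx, int mvy)
{
    const uint8_t *src = ref + (by + mvy) * stride + (bx + mvx);
    uint8_t *out = dst + by * stride + bx;

    for (int y = 0; y < 16; y++) {
        for (int x = 0; x < 16; x++)
            out[x] = src[x];   /* full-pel: a plain copy, no filtering */
        src += stride;
        out += stride;
    }
}
```

A decoder would call something like this once per macroblock, with the vector
read from the bitstream, before adding the decoded residual.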

> The number one CPU eater is YUV420To2VUY_W1x, a QuickTime
> function. I assume that '2VUY' is the Radeon favorite
> YUV format for overlay, and as libSDL doesn't handle it
> directly, QuickTime converts the data on-the-fly.

This is strange because a Radeon supports both of them.
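
For reference, a conversion of that kind boils down to roughly the sketch
below. It assumes that '2VUY' means the packed 4:2:2 layout with byte order
Cb Y0 Cr Y1 (commonly called UYVY) and that the source is planar 4:2:0 with
tightly packed planes and even width and height. The function name is
hypothetical and this is certainly not QuickTime's implementation, just the
plain C idea: each chroma row is reused for two luma rows.

```c
#include <stdint.h>

/* Hypothetical helper, for illustration only.
 * Converts planar YUV 4:2:0 to packed 4:2:2 in Cb Y0 Cr Y1 order.
 * Assumptions: width and height are even, the Y plane has width bytes
 * per row and the U/V planes have width/2 bytes per row. */
static void yuv420p_to_uyvy(uint8_t *dst, int dst_stride,
                            const uint8_t *y_plane,
                            const uint8_t *u_plane,
                            const uint8_t *v_plane,
                            int width, int height)
{
    for (int row = 0; row < height; row++) {
        const uint8_t *y = y_plane + row * width;
        /* one chroma row serves two luma rows (no vertical filtering) */
        const uint8_t *u = u_plane + (row / 2) * (width / 2);
        const uint8_t *v = v_plane + (row / 2) * (width / 2);
        uint8_t *out = dst + row * dst_stride;

        for (int x = 0; x < width / 2; x++) {
            out[4 * x + 0] = u[x];
            out[4 * x + 1] = y[2 * x];
            out[4 * x + 2] = v[x];
            out[4 * x + 3] = y[2 * x + 1];
        }
    }
}
```

Every output byte is touched once per frame, so a routine like this is
memory-bound and can easily show up near the top of a profile.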

>    %   cumulative   self              self     total
>   time   seconds   seconds    calls  ms/call  ms/call  name
>   18.8       9.74     9.74                             _YUV420To2VUY_W1x [3]
>   18.4      19.24     9.50                             _moncount (99654)
>    6.5      22.58     3.34                             _memmove [12]
>    5.4      25.40     2.82                             mcount (15826)
>    4.0      27.46     2.06                             _gmc1_altivec [14]
>    3.8      29.43     1.97                             _memset [15]
>    2.4      30.67     1.24                             _mach_msg_overwrite [19]
>    2.3      31.84     1.17                             _nanosleep [20]
>    2.2      32.99     1.15                             _RateConvertStereo16AltiVec [21]
>    2.0      34.04     1.05 13765710     0.00     0.00  _mpeg4_decode_block [22]
>    1.7      34.94     0.90   164880     0.01     0.01  _synth_1to1 [23]
>    1.7      35.82     0.88                             _syscall [24]
>    1.6      36.65     0.83                             _put_pixels16_altivec [25]
>    1.6      37.46     0.81  2457600     0.00     0.00  _MPV_decode_mb [9]
>    1.5      38.26     0.80                             _loadVectorShort [26]
>    1.3      38.95     0.69                             _idct_add_altivec [27]
>    1.1      39.53     0.58   901254     0.00     0.00  _put_pixels8_xy2_c [28]
>    1.0      40.07     0.54  2829604     0.00     0.00  _mpeg_motion [11]
>    1.0      40.58     0.51                             _avg_pixels16_altivec [29]
>    1.0      41.09     0.51                             _sdevCopyBuffer [30]
> #####

This is completely upside down; I wouldn't remotely rely on a profile
dump where decoding the whole movie takes less time than copying chunks
of memory.

You need to compile the whole application with profiling enabled AND link
it against a proper libc; most of the entries above have no call counts
at all.

-- 
Servus,
       Daniel