[MPlayer-dev-eng] [RFC] disable fastmemcpy on x86-64 by default
    Attila Kinali 
    attila at kinali.ch
       
    Sun May 27 22:47:55 CEST 2007
    
    
  
On Sun, 27 May 2007 18:19:48 +0200
Reimar Döffinger <Reimar.Doeffinger at stud.uni-karlsruhe.de> wrote:
> Hello,
> since SSE is part of the x86-64 architecture, at least glibc makes use
> of it for its memcpy and some quick (and imprecise) tests indicate that
> it's at least not slower.
> So what do you think about attached patch? Can someone do more concise
> benchmarks?
Here are some benchmarks:
System:
attila at jashugan:~ # uname -a
Linux jashugan 2.6.18 #1 Wed Sep 27 17:50:21 CEST 2006 x86_64 GNU/Linux
attila at jashugan:~ # cat /proc/cpuinfo 
processor       : 0
vendor_id       : AuthenticAMD
cpu family      : 15
model           : 55
model name      : AMD Athlon(tm) 64 Processor 3700+
stepping        : 2
cpu MHz         : 2202.856
cache size      : 1024 KB
fpu             : yes
fpu_exception   : yes
cpuid level     : 1
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx mmxext fxsr_opt lm 3dnowext 3dnow pni lahf_lm
bogomips        : 4409.53
TLB size        : 1024 4K pages
clflush size    : 64
cache_alignment : 64
address sizes   : 40 bits physical, 48 bits virtual
power management: ts fid vid ttp
attila at jashugan:~ # dpkg -s libc6|grep Version
Version: 2.3.6.ds1-13
attila at jashugan:~ # dpkg -s gcc|grep Version
Version: 4:4.1.1-15
attila at jashugan:~ # free -m
             total       used       free     shared    buffers     cached
Mem:          2012       1952         59          0          4       1125
-/+ buffers/cache:        822       1190
Swap:         7812          0       7812
Graphics card is a Matrox G550; the video output driver (vo) used is xmga.
All benchmarks are best of 3 after one burn-in run, played from a local
SATA disk (i.e. from RAM once the burn-in run has cached the file).
Standard parameters: -quiet -nosound -benchmark
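The "best of 3, with one burn-in" procedure could be scripted roughly as follows. This is only a sketch, not the script used for the numbers in this mail: the helper names parse_benchmark and best_of are made up for illustration, and it assumes MPlayer prints its "BENCHMARKs:" summary line on stdout.

```python
import re
import subprocess

# Matches the "BENCHMARKs:" summary line that MPlayer prints with -benchmark.
_BENCH_RE = re.compile(
    r"BENCHMARKs:\s*VC:\s*([\d.]+)s\s*VO:\s*([\d.]+)s\s*"
    r"A:\s*([\d.]+)s\s*Sys:\s*([\d.]+)s\s*=\s*([\d.]+)s"
)


def parse_benchmark(output):
    """Return (vc, vo, audio, sys, total) in seconds from MPlayer output."""
    match = _BENCH_RE.search(output)
    if match is None:
        raise ValueError("no BENCHMARKs line found in output")
    return tuple(float(x) for x in match.groups())


def best_of(mplayer_cmd, runs=3, burn_in=1):
    """Run mplayer_cmd burn_in + runs times; return the timed run with the lowest total."""
    results = []
    for i in range(burn_in + runs):
        proc = subprocess.run(
            mplayer_cmd, capture_output=True, text=True, errors="replace"
        )
        if i >= burn_in:  # burn-in runs only pull the file into the page cache
            results.append(parse_benchmark(proc.stdout))
    return min(results, key=lambda r: r[-1])  # r[-1] is the total time
```

It would be called once per binary (with and without the patch), for example best_of(["mplayer", "-quiet", "-nosound", "-benchmark", "-vo", "xmga", "-frames", "2000", "/tmp/sample.mkv"]), where the sample path is a placeholder.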
-------------------------------------------------
Benchmark 1:
(first 2000 frames only)
MPlayer dev-SVN-r23390-4.1.2 (C) 2000-2007 MPlayer Team
CPU: AMD Athlon(tm) 64 Processor 3700+ (Family: 15, Model: 55, Stepping: 2)
CPUflags:  MMX: 1 MMX2: 1 3DNow: 1 3DNow2: 1 SSE: 1 SSE2: 1
Compiled for x86 CPU with extensions: MMX MMX2 3DNow 3DNowEx SSE SSE2
Playing /tmp/[Ayako]_Seto_no_Hanayome_-_01_(H264)_[951E16B9].mkv.
[mkv] Track ID 1: video (V_MPEG4/ISO/AVC), -vid 0
[mkv] Track ID 2: audio (A_AAC), -aid 0, -alang und
[mkv] Track ID 3: subtitles (S_TEXT/ASS), -sid 0, -slang und
[mkv] Track ID 4: subtitles (S_TEXT/UTF8), -sid 1, -slang und
[mkv] Will play video track 1.
[mkv] No audio track found/wanted.
Matroska file format detected.
VIDEO:  [avc1]  1280x720  24bpp  23.976 fps    0.0 kbps ( 0.0 kbyte/s)
==========================================================================
Opening video decoder: [ffmpeg] FFmpeg's libavcodec codec family
Selected video codec: [ffh264] vfm: ffmpeg (FFmpeg H.264)
==========================================================================
Audio: no sound
Starting playback...
VDec: vo config request - 1280 x 720 (preferred colorspace: Planar YV12)
VDec: using Planar YV12 as output csp (no 0)
Movie-Aspect is 1.78:1 - prescaling to correct movie aspect.
VO: [xmga] 1280x720 => 1280x720 Planar YV12 
[MGA] Using 3 buffers.
w/o patch:
BENCHMARKs: VC:  19.974s VO:  44.601s A:   0.000s Sys:   0.093s =   64.668s
BENCHMARK%: VC: 30.8864% VO: 68.9698% A:  0.0000% Sys:  0.1437% = 100.0000%
w/ patch:
BENCHMARKs: VC:  19.889s VO:  44.503s A:   0.000s Sys:   0.091s =   64.484s
BENCHMARK%: VC: 30.8437% VO: 69.0146% A:  0.0000% Sys:  0.1416% = 100.0000%
-------------------------------------------------
Benchmark 2:
MPlayer dev-SVN-r23390-4.1.2 (C) 2000-2007 MPlayer Team
CPU: AMD Athlon(tm) 64 Processor 3700+ (Family: 15, Model: 55, Stepping: 2)
CPUflags:  MMX: 1 MMX2: 1 3DNow: 1 3DNow2: 1 SSE: 1 SSE2: 1
Compiled for x86 CPU with extensions: MMX MMX2 3DNow 3DNowEx SSE SSE2
Playing /tmp/Inuyasha Movie Commercial 01 Dvd Rip.avi.
AVI file format detected.
[aviheader] Video stream found, -vid 0
[aviheader] Audio stream found, -aid 1
VIDEO:  [DIVX]  720x480  24bpp  23.976 fps  4038.1 kbps (492.9 kbyte/s)
Clip info:
 Software: Nandub v1.0rc2
==========================================================================
Opening video decoder: [ffmpeg] FFmpeg's libavcodec codec family
Selected video codec: [ffodivx] vfm: ffmpeg (FFmpeg MPEG-4)
==========================================================================
Audio: no sound
Starting playback...
[mpeg4 @ 0xcedc20]looks like this file was encoded with (divx4/(old)xvid/opendivx) -> forcing low_delay flag
VDec: vo config request - 720 x 480 (preferred colorspace: Planar YV12)
VDec: using Planar YV12 as output csp (no 0)
Movie-Aspect is undefined - no prescaling applied.
VO: [xmga] 720x480 => 720x480 Planar YV12 
[MGA] Using 3 buffers.
w/o patch:
BENCHMARKs: VC:   9.980s VO:   0.003s A:   0.000s Sys:   0.046s =   10.029s
BENCHMARK%: VC: 99.5084% VO:  0.0336% A:  0.0000% Sys:  0.4580% = 100.0000%
w/ patch:
BENCHMARKs: VC:   8.833s VO:   0.003s A:   0.000s Sys:   0.047s =    8.883s
BENCHMARK%: VC: 99.4307% VO:  0.0367% A:  0.0000% Sys:  0.5326% = 100.0000%
-------------------------------------------------
Benchmark 3:
MPlayer dev-SVN-r23390-4.1.2 (C) 2000-2007 MPlayer Team
CPU: AMD Athlon(tm) 64 Processor 3700+ (Family: 15, Model: 55, Stepping: 2)
CPUflags:  MMX: 1 MMX2: 1 3DNow: 1 3DNow2: 1 SSE: 1 SSE2: 1
Compiled for x86 CPU with extensions: MMX MMX2 3DNow 3DNowEx SSE SSE2
Playing /tmp/[AnY-Spork] Iriya no Sora, UFO no Natsu - 1 [DVD-MP3][46B1F913].avi.
AVI file format detected.
[aviheader] Video stream found, -vid 0
[aviheader] Audio stream found, -aid 1
VIDEO:  [XVID]  640x480  24bpp  23.976 fps  1064.2 kbps (129.9 kbyte/s)
Clip info:
 Software: VirtualDubMod 1.5.10.1 (build 2366/release)
==========================================================================
Opening video decoder: [ffmpeg] FFmpeg's libavcodec codec family
Selected video codec: [ffodivx] vfm: ffmpeg (FFmpeg MPEG-4)
==========================================================================
Audio: no sound
Starting playback...
VDec: vo config request - 640 x 480 (preferred colorspace: Planar YV12)
VDec: using Planar YV12 as output csp (no 0)
Movie-Aspect is 1.33:1 - prescaling to correct movie aspect.
VO: [xmga] 640x480 => 640x480 Planar YV12 
[MGA] Using 3 buffers.
w/o patch:
BENCHMARKs: VC: 307.732s VO:   0.126s A:   0.000s Sys:   1.025s =  308.883s
BENCHMARK%: VC: 99.6274% VO:  0.0407% A:  0.0000% Sys:  0.3319% = 100.0000%
w/ patch:
BENCHMARKs: VC: 307.750s VO:   0.112s A:   0.000s Sys:   1.093s =  308.954s
BENCHMARK%: VC: 99.6102% VO:  0.0361% A:  0.0000% Sys:  0.3537% = 100.0000%
-------------------------------------------------
Benchmark 4:
V for Vendetta DVD, copied to disk, with -ss 4:00 -frames 4000
MPlayer dev-SVN-r23390-4.1.2 (C) 2000-2007 MPlayer Team
CPU: AMD Athlon(tm) 64 Processor 3700+ (Family: 15, Model: 55, Stepping: 2)
CPUflags:  MMX: 1 MMX2: 1 3DNow: 1 3DNow2: 1 SSE: 1 SSE2: 1
Compiled for x86 CPU with extensions: MMX MMX2 3DNow 3DNowEx SSE SSE2
Playing dvd://1.
libdvdread: Couldn't find device name.
There are 2 titles on this DVD.
There are 12 chapters in this DVD title.
There are 1 angles in this DVD title.
audio stream: 0 format: ac3 (stereo) language: unknown aid: 128.
number of audio channels on disk: 1.
number of subtitles on disk: 0
MPEG-PS file format detected.
VIDEO:  MPEG2  720x480  (aspect 3)  29.970 fps    0.0 kbps ( 0.0 kbyte/s)
==========================================================================
Opening video decoder: [mpegpes] MPEG 1/2 Video passthrough
VDec: vo config request - 720 x 480 (preferred colorspace: Mpeg PES)
Could not find matching colorspace - retrying with -vf scale...
Opening video filter: [scale]
The selected video_out device is incompatible with this codec.
Try appending the scale filter to your filter list,
e.g. -vf spp,scale instead of -vf spp.
VDecoder init failed :(
Opening video decoder: [libmpeg2] MPEG 1/2 Video decoder libmpeg2-v0.4.0b
Selected video codec: [mpeg12] vfm: libmpeg2 (MPEG-1 or 2 (libmpeg2))
==========================================================================
Audio: no sound
Starting playback...
VDec: vo config request - 720 x 480 (preferred colorspace: Planar YV12)
VDec: using Planar YV12 as output csp (no 0)
Movie-Aspect is 1.78:1 - prescaling to correct movie aspect.
VO: [xmga] 720x480 => 854x480 Planar YV12 
[MGA] Using 3 buffers.
w/o patch:
BENCHMARKs: VC:   6.520s VO:  42.695s A:   0.000s Sys:   0.313s =   49.528s
BENCHMARK%: VC: 13.1638% VO: 86.2044% A:  0.0000% Sys:  0.6318% = 100.0000%
w/ patch:
BENCHMARKs: VC:   6.178s VO:  36.899s A:   0.000s Sys:   0.308s =   43.386s
BENCHMARK%: VC: 14.2394% VO: 85.0499% A:  0.0000% Sys:  0.7107% = 100.0000%
-------------------------------------------------
Benchmark 5:
(first 10000 frames only)
MPlayer dev-SVN-r23390-4.1.2 (C) 2000-2007 MPlayer Team
CPU: AMD Athlon(tm) 64 Processor 3700+ (Family: 15, Model: 55, Stepping: 2)
CPUflags:  MMX: 1 MMX2: 1 3DNow: 1 3DNow2: 1 SSE: 1 SSE2: 1
Compiled for x86 CPU with extensions: MMX MMX2 3DNow 3DNowEx SSE SSE2
Playing /tmp/Hana Bi.avi.
AVI file format detected.
[aviheader] Video stream found, -vid 0
[aviheader] Audio stream found, -aid 1
VIDEO:  [DIVX]  560x320  24bpp  23.976 fps  809.7 kbps (98.8 kbyte/s)
==========================================================================
Opening video decoder: [ffmpeg] FFmpeg's libavcodec codec family
Selected video codec: [ffodivx] vfm: ffmpeg (FFmpeg MPEG-4)
==========================================================================
Audio: no sound
Starting playback...
[mpeg4 @ 0xcedc20]looks like this file was encoded with (divx4/(old)xvid/opendivx) -> forcing low_delay flag
VDec: vo config request - 560 x 320 (preferred colorspace: Planar YV12)
VDec: using Planar YV12 as output csp (no 0)
Movie-Aspect is undefined - no prescaling applied.
VO: [xmga] 560x320 => 560x320 Planar YV12 
[MGA] Using 3 buffers.
w/o patch:
BENCHMARKs: VC:  71.232s VO:   0.018s A:   0.000s Sys:   0.199s =   71.449s
BENCHMARK%: VC: 99.6961% VO:  0.0257% A:  0.0000% Sys:  0.2781% = 100.0000%
w/ patch:
BENCHMARKs: VC:  61.344s VO:   0.019s A:   0.000s Sys:   0.174s =   61.537s
BENCHMARK%: VC: 99.6857% VO:  0.0316% A:  0.0000% Sys:  0.2827% = 100.0000%
-------------------------------------------------
I also tested a few other samples similar to benchmarks 1 and 3
(i.e. anime with divx3, divx4, xvid and h.264 codecs), one run each; they
didn't show any significant speed difference (<1%).
Benchmarks 2 and 5 are interesting: both are faster with
the patch. They are also the only ones I came across that
were decoded using the low_delay flag.
If someone is interested in this, I could search for more samples
of this kind; I should have some.
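
To make the comparison easier to read, the relative change per benchmark can be computed from the BENCHMARKs lines above. A small sketch (the numbers are copied from this mail; the helper percent_change is just for illustration):

```python
# (VC, VO, total) in seconds, copied from the BENCHMARKs lines of each run.
# First tuple: without the patch, second tuple: with the patch.
results = {
    1: ((19.974, 44.601, 64.668), (19.889, 44.503, 64.484)),
    2: ((9.980, 0.003, 10.029), (8.833, 0.003, 8.883)),
    3: ((307.732, 0.126, 308.883), (307.750, 0.112, 308.954)),
    4: ((6.520, 42.695, 49.528), (6.178, 36.899, 43.386)),
    5: ((71.232, 0.018, 71.449), (61.344, 0.019, 61.537)),
}


def percent_change(before, after):
    """Relative change in percent; negative means faster with the patch."""
    return (after - before) / before * 100.0


for number, (without, with_patch) in results.items():
    vc = percent_change(without[0], with_patch[0])
    total = percent_change(without[2], with_patch[2])
    print(f"Benchmark {number}: VC {vc:+.2f}%  total {total:+.2f}%")
```

By this calculation the totals of benchmarks 2 and 5 drop by roughly 11% and 14%, while benchmarks 1 and 3 stay within 1%. The total of benchmark 4 also drops by about 12%, but there most of the difference is in the VO column rather than in VC.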
				Attila Kinali
-- 
Linux ist... wenn man einfache Dinge auch mit einer kryptischen
post-fix Sprache loesen kann
                        -- Daniel Hottinger
    
    