[FFmpeg-devel] [PATCH] Port extra x264 CPU detection code
Michael Niedermayer
michaelni
Sat Jan 10 03:21:43 CET 2009
On Wed, Jan 07, 2009 at 11:46:59AM -0500, Jason Garrett-Glaser wrote:
> This patch adds two features from x264:
>
> 1. Completely disable SSE2 on Core 1 and Pentium-M CPUs (detected
> using family/model/stepping). These CPUs are so slow at SSE2 that it
> is almost universally slower than MMX. Even with x264's enormous
> library of asm functions, only a single one turned out to be faster on
> SSE2 than MMX, and only by a few clocks, so we simply pretend that
> these CPUs do not have SSE2 at all.
>
> Yes, these CPUs have much slower SSE2 than even the Athlon 64.
>
> 2. Replace 3DNOW with SSE2_IS_SLOW when used for that purpose. This
> is because the Phenom has 3DNOW, but it isn't slow at SSE2.
[...]
> @@ -75,7 +76,7 @@
> if (a == c)
> return 0; /* CPUID not supported */
>
> - cpuid(0, max_std_level, ebx, ecx, edx);
> + cpuid( 0, max_std_level, vendor[0], vendor[2], vendor[1] );
^ ^
Please don't add these spaces inside the parentheses.
>
> if(max_std_level >= 1){
> cpuid(1, eax, ebx, ecx, std_caps);
> @@ -90,9 +91,21 @@
> if (ecx & 1)
> rval |= FF_MM_SSE3;
> if (ecx & 0x00000200 )
> - rval |= FF_MM_SSSE3
> + rval |= FF_MM_SSSE3;
> #endif
> - ;
This will break compilation when HAVE_SSE is not set: the ';' removed
after the #endif also terminated the statement that precedes the #if block.
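A minimal sketch of the pattern at issue, assuming the statement just
before the conditional block is deliberately left unterminated so that one
trailing ';' closes it in both configurations. The DEMO_* names and flag
values are hypothetical and exist only for this illustration; the real
surrounding code is not shown in this mail.

```c
/* Hypothetical flag values and build switch, for illustration only. */
#define DEMO_MMXEXT 0x02
#define DEMO_SSE    0x08
#define DEMO_SSSE3  0x80

#ifndef DEMO_HAVE_SSE
#define DEMO_HAVE_SSE 0   /* build with -DDEMO_HAVE_SSE=1 for the other case */
#endif

/* caps bit 0 stands in for "MMXEXT/SSE present", bit 1 for "SSSE3 present". */
static int demo_detect(int caps)
{
    int rval = 0;

    if (caps & 1)
        rval |= DEMO_MMXEXT
#if DEMO_HAVE_SSE
              | DEMO_SSE;
    if (caps & 2)
        rval |= DEMO_SSSE3
#endif
              ;   /* closes whichever statement is still open */

    /* If this ';' were moved inside the #if block (after DEMO_SSSE3),
     * the DEMO_HAVE_SSE == 0 build would be left with
     * "rval |= DEMO_MMXEXT" followed directly by "return", which does
     * not compile. */
    return rval;
}
```

With DEMO_HAVE_SSE at 0, demo_detect(3) yields only DEMO_MMXEXT. New code
that must be compiled in both configurations is safest placed after the
trailing ';'.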
> + if( !strcmp((char*)vendor, "GenuineIntel") ){
> + int family, model, stepping;
> + family = ((eax>>8)&0xf) + ((eax>>20)&0xff);
> + model = ((eax>>4)&0xf) + ((eax>>12)&0xf0);
> + stepping = eax&0xf;
> + /* 6/9 (pentium-m "banias"), 6/13 (pentium-m "dothan"), and 6/14 (core1 "yonah")
> + * theoretically support sse2, but it's significantly slower than mmx for
> + * basically all functions, so let's just pretend they don't. */
> + if( family==6 && (model==9 || model==13 || model==14) ){
> + rval &= ~FF_MM_SSE2;
> + assert(!(rval&FF_MM_SSSE3));
> + }
> + }
> }
>
> cpuid(0x80000000, max_ext_level, ebx, ecx, edx);
I am not entirely happy about lying about the supported feature set.
That said, I am not rejecting this; rather, I abstain from approving it.
If the others think this is OK, then so do I; if not, then not.
The rest looks OK.
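For reference, the family/model/stepping check in the quoted hunk can be
sketched on its own as below. This is only an illustration under stated
assumptions: the DEMO_* names and flag values are hypothetical (the real
FF_MM_* values do not appear in this mail), and the caller is assumed to
have already confirmed the "GenuineIntel" vendor string.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flag bits, for illustration only. */
#define DEMO_SSE2   0x10
#define DEMO_SSSE3  0x80

typedef struct {
    int family;
    int model;
    int stepping;
} DemoCpuSig;

/* Split the EAX value returned by CPUID leaf 1 into its fields,
 * using the same arithmetic as the quoted patch. */
static DemoCpuSig demo_decode_signature(uint32_t eax)
{
    DemoCpuSig s;
    s.family   = ((eax >> 8) & 0xf) + ((eax >> 20) & 0xff); /* base + extended family */
    s.model    = ((eax >> 4) & 0xf) + ((eax >> 12) & 0xf0); /* base + (extended model << 4) */
    s.stepping = eax & 0xf;
    return s;
}

/* Family 6, models 9, 13 and 14 are the ones the patch singles out. */
static int demo_sse2_is_slow(DemoCpuSig s)
{
    return s.family == 6 &&
           (s.model == 9 || s.model == 13 || s.model == 14);
}

/* Clear the SSE2 bit for the listed models, as the patch does.
 * Assumes the vendor check has already passed. */
static int demo_filter_flags(int flags, uint32_t eax)
{
    if (demo_sse2_is_slow(demo_decode_signature(eax))) {
        flags &= ~DEMO_SSE2;
        assert(!(flags & DEMO_SSSE3)); /* same sanity check as the patch */
    }
    return flags;
}
```

For example, a signature value of 0x6E8 decodes to family 6, model 14,
stepping 8, so demo_filter_flags() clears DEMO_SSE2 for it, while 0x10676
decodes to family 6, model 23 and is left untouched.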
[...]
--
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
Democracy is the form of government in which you can choose your dictator