[FFmpeg-devel] [PATCH] Add check for Athlon64 and similar AMD processors with slow SSE2.

Ronald S. Bultje rsbultje
Sun Feb 6 04:23:12 CET 2011


Hi,

On Sat, Feb 5, 2011 at 10:04 PM, Jason Garrett-Glaser <jason at x264.com> wrote:
> On Sat, Feb 5, 2011 at 5:46 PM, Ronald S. Bultje <rsbultje at gmail.com> wrote:
>> Hi,
>>
>> On Fri, Feb 4, 2011 at 1:03 PM, Justin Ruggles <justin.ruggles at gmail.com> wrote:
>>> On 02/04/2011 12:27 PM, Ronald S. Bultje wrote:
>>>> I'm not against the original idea of reusing SSE2SLOW, just make sure
>>>> it's properly documented.
>>>> - SSE2 - CPU supports good SSE2
>>>> - SSE2SLOW (core1 etc.) - CPU supports SSE2 in theory but it's almost
>>>> always slower - only set SSE2 functions if explicitly tested to be
>>>> faster
>>>> - SSE2|SSE2SLOW (athlon64 etc.) - CPU supports SSE2 but it's
>>>> occasionally slower - don't set SSE2 functions if explicitly tested to
>>>> be slower
>>>>
>>>> And I thought that's what your patch did.
>>>
>>>
>>> It did. But I think it made one of the flag checks more complicated.
>>>
>>> all sse2:
>>> flags & (SSE2 | SSE2SLOW)
>>>
>>> exclude core 1 only:
>>> flags & SSE2
>>>
>>> exclude core 1 and athlon64:
>>> (flags & SSE2) && !(flags & SSE2SLOW)
>>> or
>>> (flags & (SSE2 | SSE2SLOW)) ^ SSE2SLOW
>>
>> (flags & (SSE2|SSE2SLOW)) == SSE2
>>
>> (^ SSE2SLOW only flips the slow bit, so the result is still non-zero
>> for Athlon64, where the SSE2 bit remains set.)
>>
>>> exclude athlon64 only:
>>> (flags & (SSE2 | SSE2SLOW)) && !(flags & SSE2 && flags & SSE2SLOW)
>>> or
>>> (flags & (SSE2 | SSE2SLOW)) ^ (SSE2 | SSE2SLOW)
>>>
>>> The first 3 are self-explanatory, but the last case is not.
>>
>> I don't think it matters. When would you ever want to exclude
>> Athlon64, but not Core1?
>
> Almost any SSE2 function?

Isn't that the other way around?

Ronald
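
For reference, the four checks discussed in the quoted text can be sketched
in C as below. This is only an illustration under stated assumptions: the
names CPU_SSE2, CPU_SSE2SLOW, SSE2_MASK, the FLAGS_* sample values and the
helper functions are hypothetical and are not the identifiers or values
used by the patch. The encoding follows the scheme described in the
thread: SSE2 alone means fast SSE2, SSE2SLOW alone means a Core 1 class
CPU, and both bits together mean an Athlon64 class CPU.

```c
#include <stdbool.h>

/* Hypothetical flag values, chosen for illustration only. */
enum {
    CPU_SSE2     = 1 << 0,  /* SSE2 is usable and generally worthwhile */
    CPU_SSE2SLOW = 1 << 1   /* SSE2 exists but is (sometimes) slow     */
};
#define SSE2_MASK (CPU_SSE2 | CPU_SSE2SLOW)

/* Sample flag words for the CPU classes named in the thread. */
enum {
    FLAGS_NO_SSE2  = 0,
    FLAGS_FAST     = CPU_SSE2,                /* good SSE2          */
    FLAGS_CORE1    = CPU_SSE2SLOW,            /* almost always slow */
    FLAGS_ATHLON64 = CPU_SSE2 | CPU_SSE2SLOW  /* occasionally slow  */
};

/* All CPUs with any form of SSE2. */
static inline bool any_sse2(int flags)
{
    return (flags & SSE2_MASK) != 0;
}

/* Exclude Core 1 only. */
static inline bool sse2_except_core1(int flags)
{
    return (flags & CPU_SSE2) != 0;
}

/* Exclude Core 1 and Athlon64. The parentheses matter: in C, ==
 * binds more tightly than &. */
static inline bool sse2_fast_only(int flags)
{
    return (flags & SSE2_MASK) == CPU_SSE2;
}

/* Exclude Athlon64 only: some SSE2 bit is set, but not both. */
static inline bool sse2_except_athlon64(int flags)
{
    int f = flags & SSE2_MASK;
    return f != 0 && f != SSE2_MASK;
}

/* The XOR form from the quoted text, kept only to show the pitfall:
 * it is non-zero for FLAGS_ATHLON64 and for FLAGS_NO_SSE2. */
static inline bool xor_form(int flags)
{
    return ((flags & SSE2_MASK) ^ CPU_SSE2SLOW) != 0;
}
```

With these definitions, sse2_fast_only() accepts only FLAGS_FAST, whereas
xor_form() also accepts FLAGS_ATHLON64, which is why the comparison
against SSE2 is the safer way to write that check.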


