[FFmpeg-devel] [PATCH 1/2] configure: add check for AVX inline support
Reimar Döffinger
Reimar.Doeffinger at gmx.de
Sun May 25 10:00:37 CEST 2014
On 16.05.2014, at 17:40, "Ronald S. Bultje" <rsbultje at gmail.com> wrote:
> Hi,
>
> On Fri, May 16, 2014 at 11:29 AM, Michael Niedermayer <michaelni at gmx.at> wrote:
>
>> On Fri, May 16, 2014 at 07:49:24AM -0400, Ronald S. Bultje wrote:
>>> Hi,
>>>
>>> On Thu, May 15, 2014 at 11:42 PM, Michael Niedermayer <michaelni at gmx.at>
>>> wrote:
>>>
>>>> On Thu, May 15, 2014 at 07:03:02PM -0300, James Almer wrote:
>>>>> Signed-off-by: James Almer <jamrial at gmail.com>
>>>>> ---
>>>>> configure | 3 ++-
>>>>> 1 file changed, 2 insertions(+), 1 deletion(-)
>>>>
>>>> applied
>>>
>>>
>>> I have big objections to this. Inline is unreadable, unportable (e.g.
>>> doesn't work on MSVC) and virtually nobody understands inline. It's beyond
>>> me that anyone wants to write avx in this atrocity of a syntax.
>>>
>>> Can we please revert this and rewrite patch 2/2 in yasm syntax? I would be
>>> greatly thankful.
>>
>> James tried this with the SSE2 code already,
>> see: [FFmpeg-devel] [PATCH 1/2] swresample: Refactor resample asm and port
>> it to yasm
>>
>> In the best case I could reproduce in that thread, yasm was 14% slower
>> than gcc inline for SSE2.
>> Was there a flaw in how I tested?
>> Do you suggest that we should use yasm even if it makes the code
>> significantly slower?
>> Does someone volunteer to write the whole loop in yasm so the calling
>> overhead is avoided?
>>
>> Other ideas that I am missing?
>>
>> About reverting: I can revert if people want, though it seems a bit of
>> an overreaction to me to revert this optimization before a similarly fast
>> yasm implementation exists.
>> That's unless it causes some regression?
>
>
> It doesn't work under MSVC, that's pretty massive.
>
> And yes, of course, if the function is inlined in another, the whole loop
> should be a single function in yasm, otherwise it's a silly comparison.
>
> But we should do that, so let's not reintroduce sloppy inline asm just
> because we're lazy. I don't mind keeping what we have, but don't make it
> worse.
Note that requiring standalone asm also has disadvantages (though they don't matter much, since hardly anyone works on non-x86).
Requiring all-asm versions means additional effort to support new architectures, for example, and often with no benefit: e.g. for ARM we do not have any way to support asm in MSVC at all (except, I guess, compiling only the asm with gcc, but then you could almost just compile the whole inline asm file with gcc).
I admit this isn't a properly thought-out comment; I just felt that a lot of the asm discussion is x86-only, which, while understandable, still seems a bit one-sided.