[FFmpeg-devel] [PATCH 1/4] configure: aarch64: Support assembling the dotprod and i8mm arch extensions

Martin Storsjö martin at martin.st
Sun May 28 00:34:15 EEST 2023


On Sat, 27 May 2023, Rémi Denis-Courmont wrote:

> On Friday, 26 May 2023, 11:03:12 EEST, Martin Storsjö wrote:
>> These are available since ARMv8.4-a and ARMv8.6-a respectively,
>> but can also be available optionally since ARMv8.2-a.
>> 
>> Check if these are available for use unconditionally (e.g. if compiling
>> with -march=armv8.6-a), or if they can be enabled with specific
>> assembler directives.
>> 
>> Use ".arch_extension <ext>" for enabling a specific extension in
>> assembly; the same can also be achieved with ".arch armv8.2-a+<ext>",
>> but with .arch_extension it is easier to combine multiple separate
>> features.
>> 
>> Enabling these extensions requires setting a base architecture level
>> of armv8.2-a with .arch. Don't add ".arch armv8.2-a" unless necessary;
>> if the base level is high enough (which might unlock other extensions
>> without .arch_extension), we don't want to lower it.
>
> I don't follow how that would actually happen, TBH. Even if the default target 
> version is, say, 8.5, the assembler won't magically start emitting 8.5 
> instructions.
>
> Someone would have to write assembler code that would fail to build under a 
> toolchain with a lower target version. That sounds like a bug that should be 
> spotted and fixed, rather than papered over.

I don't see how anything here suggests papering over such an issue?

I'm not sure exactly which parts of the message you are referring to here,
but I'll elaborate on why we should only set .arch if we really need to.


Consider a build configuration with -march=armv8.4-a. We test that the 
dotprod extension is available and usable without adding any directives - 
so we won't add any directives for that. We also test that the assembler 
does support i8mm, with ".arch armv8.2-a" plus ".arch_extension i8mm".

But if we add both ".arch armv8.2-a" and ".arch_extension i8mm", the .arch
directive lowers the base level from armv8.4-a and we break the dotprod
extension. If we only add ".arch_extension i8mm" without the .arch
directive, we get what we want, though.
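
To make that concrete, here is a minimal shell sketch of the selection
logic for a single extension. The function name select_directives and its
yes/no arguments are hypothetical (they are not the actual configure
variables); the arguments stand in for the results of assembler probes.

```sh
# Hypothetical helper, not FFmpeg's actual configure code.
# Decide which directives to prepend to the assembly for one extension.
#   $1  extension name, e.g. dotprod or i8mm
#   $2  "yes" if the extension assembles with no directives at all
#   $3  "yes" if it assembles with only ".arch_extension <ext>"
#   $4  "yes" if it assembles with ".arch armv8.2-a" + ".arch_extension <ext>"
select_directives() {
    ext=$1
    plain=$2
    ext_only=$3
    arch_and_ext=$4
    if [ "$plain" = yes ]; then
        # Already usable at the current base level; emit nothing.
        :
    elif [ "$ext_only" = yes ]; then
        # Keep the existing base level and only add the extension.
        printf '.arch_extension %s\n' "$ext"
    elif [ "$arch_and_ext" = yes ]; then
        # The base level is too low; set it only as a last resort.
        printf '.arch armv8.2-a\n.arch_extension %s\n' "$ext"
    fi
}
```

With -march=armv8.4-a, "select_directives dotprod yes yes yes" prints
nothing and "select_directives i8mm no yes yes" prints only the
.arch_extension line, so the base level is left untouched.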

> If the problem is to avoid `.arch_extension`, then I don't really see 
> why you can't just use `.arch` with plus, and simplify a lot.

Well Clang doesn't quite support that currently either. For 
".arch_extension dotprod" it errors out since it doesn't recognize the 
dotprod feature in that directive. It does accept ".arch 
armv8.2-a+dotprod" but it doesn't actually unlock using the dotprod 
extension in the assembly despite that. (I'll look into fixing this in 
upstream LLVM afterwards.)
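
Because of that second case, a probe has to assemble a real instruction
and not just the directive. A rough sketch of such a probe follows; the
helper name probe_asm and the variable dotprod_snippet are made up for
illustration, and the compiler command is assumed to accept -c and -o the
way gcc and clang do.

```sh
# Hypothetical probe helper, not FFmpeg's actual configure code.
#   $1  compiler command (assumed to accept -c and -o like gcc/clang)
#   $2  assembly source to try
# Returns success if the compiler command assembles the source.
probe_asm() {
    probe_cc=$1
    probe_src=$2
    probe_dir=$(mktemp -d) || return 1
    printf '%s\n' "$probe_src" > "$probe_dir/probe.S"
    # Word splitting of $probe_cc is intentional so that flags can be
    # passed along with the compiler name.
    $probe_cc -c -o "$probe_dir/probe.o" "$probe_dir/probe.S" >/dev/null 2>&1
    probe_ret=$?
    rm -rf "$probe_dir"
    return $probe_ret
}

# The snippet contains an actual dotprod instruction; accepting the
# directive alone does not prove that the extension got enabled.
dotprod_snippet='.arch armv8.2-a+dotprod
udot v0.4s, v0.16b, v0.16b'
```

It could then be used as e.g.
probe_asm "clang --target=aarch64-linux-gnu" "$dotprod_snippet"
and the result only trusted if the udot line assembled too.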

As Clang/LLVM currently has these limitations, one main design criterion
here is that we shouldn't add any extra .arch/.arch_extension directives
unless we need them, the assembler accepts them, and we actually gain some
instruction support from them.


Taking it back to the drawing board: for enabling e.g. i8mm, we could
either do
     .arch armv8.6-a
or
     .arch armv8.2-a+i8mm
or
     .arch armv8.2-a
     .arch_extension i8mm


Out of these, I initially preferred the third approach.

There's no functional difference between the second and third one, except
that the single-line form is messier to handle, as we can have various
combinations of what is actually supported. And with the single-line .arch
form, we can't just add e.g. i8mm on top of a -march= setting that already
supports dotprod without respecifying what the toolchain itself defaults
to.


The documentation for .arch_extension hints at it being possible to
disable support for extensions with it too, but that doesn't seem to be
the case in practice. If it were, we could add macros that enable exactly
the extensions we want around the functions that use them, and nothing
more. But I guess that if that's not actually supported, we can't do that.


I guess the alternative would be to just try to set .arch 
<highest-supported-that-we-care-about>. I was worried that support for 
e.g. armv8.6-a appeared later in toolchains than support for the 
individual extension i8mm, but at least from a quick browse in binutils 
history, they seem to have been added at the same time, so there's 
probably no such drawback.

Or what's the situation with e.g. SVE2 - was ".arch_extension sve2" 
supported significantly earlier than ".arch armv9-a"? It looks like 
binutils learnt about sve2 in 2019, but about armv9-a in 2021? OTOH that's 
probably not too much of a real issue either.

If we did that, it would simplify the configure logic a fair bit and
reduce the number of configure variables we need by a lot. It would enable
a few more instruction set extensions than we need, but that's
probably not a real issue.
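
As a rough sketch of what that simpler logic could look like (the
function name pick_highest_arch and the probe callback are hypothetical,
and the list of levels is just the ones mentioned in this thread):

```sh
# Hypothetical sketch, not FFmpeg's actual configure code.
# Try base levels from highest to lowest and print a .arch line for the
# first one that the probe accepts.
#   $1  name of a probe command that takes an architecture level and
#       returns success if the assembler accepts ".arch <level>"
pick_highest_arch() {
    probe=$1
    for level in armv8.6-a armv8.4-a armv8.2-a; do
        if "$probe" "$level"; then
            printf '.arch %s\n' "$level"
            return 0
        fi
    done
    # None of the levels were accepted; emit nothing.
    return 1
}
```

This needs one probe per level instead of separate checks and variables
per extension, at the cost of enabling everything that level implies.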

// Martin

