[FFmpeg-devel] [PATCH 1/4] configure: aarch64: Support assembling the dotprod and i8mm arch extensions
Martin Storsjö
martin at martin.st
Sun May 28 00:34:15 EEST 2023
On Sat, 27 May 2023, Rémi Denis-Courmont wrote:
> On Friday, 26 May 2023, 11:03:12 EEST, Martin Storsjö wrote:
>> These are available since ARMv8.4-a and ARMv8.6-a respectively,
>> but can also be available optionally since ARMv8.2-a.
>>
>> Check if these are available for use unconditionally (e.g. if compiling
>> with -march=armv8.6-a), or if they can be enabled with specific
>> assembler directives.
>>
>> Use ".arch_extension <ext>" for enabling a specific extension in
>> assembly; the same can also be achieved with ".arch armv8.2-a+<ext>",
>> but with .arch_extension it is easier to combine multiple separate
>> features.
>>
>> Enabling these extensions requires setting a base architecture level
>> of armv8.2-a with .arch. Don't add ".arch armv8.2-a" unless necessary;
>> if the base level is high enough (which might unlock other extensions
>> without .arch_extension), we don't want to lower it.
>
> I don't follow how that would actually happen, TBH. Even if the default target
> version is, say, 8.5, the assembler won't magically start emitting 8.5
> instructions.
>
> Someone would have to write assembler code that would fail to build under a
> toolchain with a lower target version. That sounds like a bug that should be
> spotted and fixed, rather than papered over.

I don't see how anything here suggests papering over such an issue?
I'm not sure exactly which parts of the message you refer to here, but
I'll elaborate on the point about why we should only set .arch if we
really need to.

Consider a build configuration with -march=armv8.4-a. We test that the
dotprod extension is available and usable without adding any directives -
so we won't add any directives for that. We also test that the assembler
does support i8mm, with ".arch armv8.2-a" plus ".arch_extension i8mm".
But if we do add ".arch armv8.2-a" and ".arch_extension i8mm", then we
break the dotprod extension, because .arch resets the base level to
armv8.2-a, where dotprod isn't enabled by default. If we only add
".arch_extension i8mm" without the .arch directive, we get what we want,
though.
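
A minimal shell sketch of that rule, purely as illustration: the function name arch_directives and its arguments are made up here and are not the actual configure code. It assumes we already know whether the base level from -march= is at least armv8.2-a, and which extensions don't work without a directive.

```shell
# Hypothetical helper: print the directives to prepend to assembly sources.
# $1:   "yes" if the base level selected by -march= is already >= armv8.2-a
# rest: extensions that are not usable without a directive
arch_directives() {
    base_ok=$1
    shift
    # Only set .arch when the base level is too low; otherwise keep whatever
    # -march= selected, so extensions it already enabled stay usable.
    if [ "$base_ok" != yes ] && [ $# -gt 0 ]; then
        echo ".arch armv8.2-a"
    fi
    for ext in "$@"; do
        echo ".arch_extension $ext"
    done
}

# The -march=armv8.4-a case above: dotprod already works, only i8mm is missing.
arch_directives yes i8mm    # prints only: .arch_extension i8mm
```

With a plain armv8-a default, "arch_directives no dotprod i8mm" would print the .arch line followed by both .arch_extension lines.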
> If the problem is to avoid `.arch_extension`, then I don't really see
> why you can't just use `.arch` with plus, and simplify a lot.
Well, Clang doesn't quite support that currently either. For
".arch_extension dotprod" it errors out, since it doesn't recognize the
dotprod feature in that directive. It does accept ".arch
armv8.2-a+dotprod", but that doesn't actually unlock the dotprod
instructions in the assembly. (I'll look into fixing this in
upstream LLVM afterwards.)
As Clang/LLVM has these limitations/issues currently, one main design
criterion here is that we shouldn't add any extra .arch/.arch_extension
directives unless we need to and can (and actually gain some instruction
support from it).

Taking it back to the drawing board: for enabling e.g. i8mm, we could
either do
    .arch armv8.6-a
or
    .arch armv8.2-a+i8mm
or
    .arch armv8.2-a
    .arch_extension i8mm
Out of these, I initially preferred the third approach.
There's no functional difference between the second and third one, except
that the single-line form is messier to handle, as we can have various
combinations of what actually is supported. And with the single-line .arch
form, we can't just add e.g. i8mm on top of a -march= setting that already
supports dotprod, without respecifying what the toolchain itself defaults
to.
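
To illustrate why the single-line form is awkward, here is a rough shell sketch; single_line_arch is a hypothetical name, not something from configure. The point is that the caller has to pass in both the base level and every extension to keep, including ones the toolchain had already enabled by itself.

```shell
# Hypothetical helper: build the single-line ".arch <level>+<ext>+..." form.
# $1:   base architecture level (must match or exceed the toolchain default,
#       otherwise extensions it enabled by default are lost)
# rest: every extension that should be usable afterwards
single_line_arch() {
    line=".arch $1"
    shift
    for ext in "$@"; do
        line="$line+$ext"
    done
    echo "$line"
}

# Adding i8mm on top of a dotprod-capable default means naming dotprod again.
single_line_arch armv8.2-a dotprod i8mm   # prints: .arch armv8.2-a+dotprod+i8mm
```

With separate .arch_extension lines, each extension can instead be appended independently of what the others need.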
The documentation for .arch_extension hints that it can also be used to
disable support for extensions, but that doesn't seem to be the case in
practice. If it were, we could add macros that enable specifically the
extensions we want around the functions that should use them, and nothing
more. But I guess if that's not actually supported, we can't do that.
I guess the alternative would be to just try to set .arch
<highest-supported-that-we-care-about>. I was worried that support for
e.g. armv8.6-a appeared later in toolchains than support for the
individual extension i8mm, but at least from a quick browse in binutils
history, they seem to have been added at the same time, so there's
probably no such drawback.
Or what's the situation with e.g. SVE2 - was ".arch_extension sve2"
supported significantly earlier than ".arch armv9-a"? It looks like
binutils learnt about sve2 in 2019, but about armv9-a in 2021? OTOH that's
probably not too much of a real issue either.
If we did that, it would simplify the configure logic a fair bit and
reduce the number of configure variables we need by a lot. It would enable
a few more instruction set extensions than we need, but that's
probably not a real issue.
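
A rough sketch of what that alternative could look like in shell, under the assumption that we have some probe that says whether the assembler accepts a given ".arch <level>". Both highest_arch_level and fake_probe are hypothetical names for illustration; a real probe would try assembling a one-line file with the target toolchain.

```shell
# Hypothetical helper: print the first level, in the order given, that the
# probe accepts. $1 is the name of a function or command that takes a level
# and returns 0 if the assembler accepts ".arch <level>".
highest_arch_level() {
    probe=$1
    shift
    for level in "$@"; do
        if "$probe" "$level"; then
            echo "$level"
            return 0
        fi
    done
    return 1
}

# Stand-in probe for illustration only: pretend the assembler knows
# armv8.2-a and armv8.4-a, but nothing newer.
fake_probe() {
    case $1 in
        armv8.2-a|armv8.4-a) return 0 ;;
        *) return 1 ;;
    esac
}

# Candidates are listed from highest to lowest.
highest_arch_level fake_probe armv8.6-a armv8.4-a armv8.2-a   # prints: armv8.4-a
```

The result would then be emitted as a single .arch line, with no per-extension variables needed.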
// Martin