[FFmpeg-devel] [PATCH 1/4] configure: aarch64: Support assembling the dotprod and i8mm arch extensions

Martin Storsjö martin at martin.st
Tue May 30 15:25:25 EEST 2023


On Sun, 28 May 2023, Rémi Denis-Courmont wrote:

> On Sunday, 28 May 2023, 0.34.15 EEST, Martin Storsjö wrote:
>
>> I guess the alternative would be to just try to set .arch
>> <highest-supported-that-we-care-about>. I was worried that support for
>> e.g. armv8.6-a appeared later in toolchains than support for the
>> individual extension i8mm, but at least from a quick browse in binutils
>> history, they seem to have been added at the same time, so there's
>> probably no such drawback.
>> 
>> Or what's the situation with e.g. SVE2 - was ".arch_extension sve2"
>> supported significantly earlier than ".arch armv9-a"?
>
> I have not tested SVE on LLVM. AFAIK, SVE and SVE2 are optional from 8.2 and 
> 9.0 onward respectively, and not mandatory in any version, so if your 
> toolchain supports neither .arch with plus sign, nor .arch_extension, it is 
> game over.

I didn't mean specifically whether LLVM supports it here, just in general 
wrt binutils and how to enable the feature.

FWIW it seems like SVE2 is a mandatory part of 9.0 - assembling SVE2 
instructions can be done with ".arch armv9-a". But there are about 2 years' 
worth of deployed binutils-based toolchains that do recognize 
".arch armv8.2-a; .arch_extension sve2" but don't recognize ".arch armv9-a".

So for the generic mechanism for enabling cpu features, I'd prefer to keep 
the mechanism using primarily .arch_extension (with .arch set as high as 
necessary) rather than relying solely on .arch <version> without any extra 
+<feature>.
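
To illustrate the preference above, here is a rough sketch of such a probe 
in shell. It is not the code from the patch: try_asm, pick_directives and 
the AS_CMD variable are hypothetical names made up for this example. It 
tries ".arch <base>" plus ".arch_extension <ext>" first, and only falls 
back to a bare, higher ".arch <version>" if the assembler rejects that.

```shell
#!/bin/sh
# Hypothetical sketch, not FFmpeg's actual configure logic.
# AS_CMD is assumed to be the command that assembles a .S file,
# e.g. "aarch64-linux-gnu-gcc -c".
AS_CMD=${AS_CMD:-"cc -c"}

# try_asm: assemble the source read from stdin; succeeds if it is accepted.
try_asm() {
    tmpdir=$(mktemp -d) || return 1
    cat > "$tmpdir/probe.S"
    $AS_CMD -o "$tmpdir/probe.o" "$tmpdir/probe.S" >/dev/null 2>&1
    status=$?
    rm -rf "$tmpdir"
    return $status
}

# pick_directives BASE_ARCH EXTENSION FALLBACK_ARCH TEST_INSN
# Prints the directives that made TEST_INSN assemble, preferring
# .arch_extension over a bare higher .arch level.
pick_directives() {
    base=$1 ext=$2 fallback=$3 insn=$4
    if printf '.arch %s\n.arch_extension %s\n%s\n' "$base" "$ext" "$insn" | try_asm; then
        printf '.arch %s\n.arch_extension %s\n' "$base" "$ext"
    elif printf '.arch %s\n%s\n' "$fallback" "$insn" | try_asm; then
        printf '.arch %s\n' "$fallback"
    else
        return 1
    fi
}
```

With a real aarch64 toolchain in AS_CMD, a call could look like 
pick_directives armv8.2-a sve2 armv9-a 'sqrdmlah z0.h, z1.h, z2.h'. 
A nonzero return means neither spelling was accepted, so the feature would 
have to stay disabled.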

>> If we'd do that, it does simplify the configure logic a fair bit and
>> reduces the number of configure variables we need by a lot. It does enable
>> a few more instruction set extensions than what we need though, but that's
>> probably not a real issue.
>
> Yes.

I made an attempt at simplifying the logic in configure and asm.S 
somewhat, while still primarily using .arch_extension, and while making 
sure we still can get the features assembled with current Clang with a 
high enough -march= setting. (Runtime-enabled features are out of scope 
for Clang for now, as we don't want to try to pass individual higher 
-march= options to the individual assembly files.)

// Martin

