[FFmpeg-devel] [PATCH 1/4] configure: aarch64: Support assembling the dotprod and i8mm arch extensions
Martin Storsjö
martin at martin.st
Tue May 30 15:25:25 EEST 2023
On Sun, 28 May 2023, Rémi Denis-Courmont wrote:
> On Sunday, 28 May 2023, 0.34.15 EEST, Martin Storsjö wrote:
>
>> I guess the alternative would be to just try to set .arch
>> <highest-supported-that-we-care-about>. I was worried that support for
>> e.g. armv8.6-a appeared later in toolchains than support for the
>> individual extension i8mm, but at least from a quick browse in binutils
>> history, they seem to have been added at the same time, so there's
>> probably no such drawback.
>>
>> Or what's the situation with e.g. SVE2 - was ".arch_extension sve2"
>> supported significantly earlier than ".arch armv9-a"?
>
> I have not tested SVE on LLVM. AFAIK, SVE and SVE2 are optional from 8.2 and
> 9.0 onward respectively, and not mandatory in any version, so if your
> toolchain supports neither .arch with plus sign, nor .arch_extension, it is
> game over.
I didn't mean specifically whether LLVM supports it here, just in general
wrt binutils and how to enable the feature.
FWIW it seems like SVE2 is a mandatory part of 9.0 - assembling SVE2
instructions can be done with ".arch armv9-a". But there are about two
years' worth of deployed binutils-based toolchains that do recognize ".arch
armv8.2-a; .arch_extension sve2" but don't recognize ".arch armv9-a".
So for the generic mechanism for enabling cpu features, I'd prefer to keep
the mechanism using primarily .arch_extension (with .arch set as high as
necessary) rather than relying solely on .arch <version> without any extra
+<feature>.
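As a rough sketch of that mechanism: the shell helpers below build a test snippet that sets a baseline with .arch, enables one feature with .arch_extension, and pipes it to an assembler command. The helper names (asm_feature_snippet, probe_asm_feature) are hypothetical and are not the actual configure code; they only illustrate the idea.

```shell
#!/bin/sh
# Sketch only: these helper names are hypothetical and are not taken from
# FFmpeg's configure script.

# Print an assembler test snippet that sets a baseline with .arch and then
# enables a single feature with .arch_extension. An optional third argument
# is an instruction from that extension to assemble as well.
asm_feature_snippet() {
    base_arch=$1
    feature=$2
    insn=${3:-}
    printf '.arch %s\n' "$base_arch"
    printf '.arch_extension %s\n' "$feature"
    if [ -n "$insn" ]; then
        printf '%s\n' "$insn"
    fi
}

# Feed the snippet on stdin to the command given after the first three
# arguments; succeed only if that command exits successfully.
probe_asm_feature() {
    snippet=$(asm_feature_snippet "$1" "$2" "$3")
    shift 3
    printf '%s\n' "$snippet" | "$@" >/dev/null 2>&1
}
```

Assuming an assembler that reads its input from stdin when no file is named, a call such as `probe_asm_feature armv8.2-a sve2 '' aarch64-linux-gnu-as -o /dev/null` (the binary name is an assumption) would tell whether the toolchain accepts the ".arch armv8.2-a; .arch_extension sve2" combination, even if it doesn't know ".arch armv9-a".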
>> If we'd do that, it does simplify the configure logic a fair bit and
>> reduces the number of configure variables we need by a lot. It does enable
>> a few more instruction set extensions than what we need though, but that's
>> probably not a real issue.
>
> Yes.
I made an attempt at simplifying the logic in configure and asm.S
somewhat, while still primarily using .arch_extension, and while making
sure we can still get the features assembled with current Clang given a
high enough -march= setting. (Runtime-enabled features are out of scope
for Clang for now, as we don't want to try to pass individual higher
-march= options to the individual assembly files.)
// Martin