[FFmpeg-devel] [PATCH v2 3/5] aarch64: Add Linux runtime cpu feature detection using getauxval(AT_HWCAP)
Martin Storsjö
martin at martin.st
Wed May 31 22:37:56 EEST 2023
On Wed, 31 May 2023, Rémi Denis-Courmont wrote:
> On Tuesday, 30 May 2023, 15:30:41 EEST, Martin Storsjö wrote:
>> Based partially on code by Janne Grunau.
>>
>> ---
>> Updated to use both the direct HWCAP* macros and HWCAP_CPUID. A
>> not unreasonably old distribution like Ubuntu 20.04 does have
>> HWCAP_CPUID but not HWCAP2_I8MM in the distribution provided headers.
>>
>> Alternatively I guess we could carry our own fallback hardcoded values
>> for the HWCAP* values we use and skip HWCAP_CPUID.
>> ---
>> configure | 2 ++
>> libavutil/aarch64/cpu.c | 63 +++++++++++++++++++++++++++++++++++++++++
>> 2 files changed, 65 insertions(+)
>>
>> diff --git a/configure b/configure
>> index 50eb27ba0e..b39de74de5 100755
>> --- a/configure
>> +++ b/configure
>> @@ -2209,6 +2209,7 @@ HAVE_LIST_PUB="
>>
>> HEADERS_LIST="
>> arpa_inet_h
>> + asm_hwcap_h
>> asm_types_h
>> cdio_paranoia_h
>> cdio_paranoia_paranoia_h
>> @@ -6432,6 +6433,7 @@ check_headers io.h
>> enabled libdrm &&
>> check_headers linux/dma-buf.h
>>
>> +check_headers asm/hwcap.h
>> check_headers linux/perf_event.h
>> check_headers libcrystalhd/libcrystalhd_if.h
>> check_headers malloc.h
>> diff --git a/libavutil/aarch64/cpu.c b/libavutil/aarch64/cpu.c
>> index 0c76f5ad15..4563959ffd 100644
>> --- a/libavutil/aarch64/cpu.c
>> +++ b/libavutil/aarch64/cpu.c
>> @@ -20,6 +20,67 @@
>> #include "libavutil/cpu_internal.h"
>> #include "config.h"
>>
>> +#if (defined(__linux__) || defined(__ANDROID__)) && HAVE_GETAUXVAL && HAVE_ASM_HWCAP_H
>> +#include <stdint.h>
>> +#include <asm/hwcap.h>
>> +#include <sys/auxv.h>
>> +
>> +#define get_cpu_feature_reg(reg, val) \
>> + __asm__("mrs %0, " #reg : "=r" (val))
>> +
>> +static int detect_flags(void)
>> +{
>> + int flags = 0;
>> + unsigned long hwcap, hwcap2;
>> +
>> + // Check for support using direct individual HWCAPs
>> + hwcap = getauxval(AT_HWCAP);
>> +#ifdef HWCAP_ASIMDDP
>> + if (hwcap & HWCAP_ASIMDDP)
>> + flags |= AV_CPU_FLAG_DOTPROD;
>> +#endif
>> +
>> +#ifdef AT_HWCAP2
>> + hwcap2 = getauxval(AT_HWCAP2);
>> +#ifdef HWCAP2_I8MM
>> + if (hwcap2 & HWCAP2_I8MM)
>> + flags |= AV_CPU_FLAG_I8MM;
>> +#endif
>> +#endif
>> +
>> + // Silence warnings if none of the hwcaps to check are known.
>> + (void)hwcap;
>> + (void)hwcap2;
>> +
>> +#if defined(HWCAP_CPUID)
>> + // The HWCAP_* defines for individual extensions may become available late, as
>> + // they require updates to userland headers. As a fallback, see if we can access
>> + // the CPUID registers (trapped via the kernel).
>> + // See https://www.kernel.org/doc/html/latest/arm64/cpu-feature-registers.html
>
> I don't actually care which method is used and whether to hard-code the
> missing constants or not. But doing both methods is weird. If you are going to
> trigger the TID3 traps anyway, there is no point checking the auxiliary
> vectors before, AFAICT.
Yeah, that's true.
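For reference, the hard-coded-constants alternative from the patch notes
could look roughly like the sketch below. The bit positions are the ones
I believe the arm64 <asm/hwcap.h> uses; the EX_FLAG_* names and
flags_from_hwcaps() are hypothetical stand-ins for the AV_CPU_FLAG_*
values and are not part of the patch. On Linux the two arguments would
come from getauxval(AT_HWCAP) and getauxval(AT_HWCAP2).

```c
/* Fallback definitions, used only if the toolchain headers are too old
 * to provide them. Values assumed to match the arm64 Linux UAPI header. */
#ifndef HWCAP_ASIMDDP
#define HWCAP_ASIMDDP (1UL << 20)
#endif
#ifndef HWCAP2_I8MM
#define HWCAP2_I8MM   (1UL << 13)
#endif

/* Hypothetical stand-ins for AV_CPU_FLAG_DOTPROD / AV_CPU_FLAG_I8MM. */
#define EX_FLAG_DOTPROD (1 << 0)
#define EX_FLAG_I8MM    (1 << 1)

/* Map raw AT_HWCAP / AT_HWCAP2 words to feature flags. */
static int flags_from_hwcaps(unsigned long hwcap, unsigned long hwcap2)
{
    int flags = 0;

    if (hwcap & HWCAP_ASIMDDP)
        flags |= EX_FLAG_DOTPROD;
    if (hwcap2 & HWCAP2_I8MM)
        flags |= EX_FLAG_I8MM;
    return flags;
}
```

With that, the #ifdefs around the individual checks go away, at the cost
of keeping the constants in sync with the kernel by hand.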
> You *could* check the auxiliary vectors as a run-time fallback if HWCAP_CPUID
> is *not* set, but that only really makes sense for HWCAP_FP and HWCAP_ASIMD, not for
> HWCAP_ASIMDDP (Linux 4.15) and HWCAP2_I8MM (Linux 5.6) which are more recent
> than HWCAP_CPUID (Linux 4.11). And then, that would be only in the corner case
> that FP and/or AdvSIMD were explicitly disabled since they are on by default
> for all AArch64 targets.
Yeah - I guess there's no potential configuration where a kernel does know
about HWCAP_CPUID and newer HWCAPs but has decided to set HWCAP_CPUID to 0
and not handle the trapping?
I considered falling back on the trapping CPUID codepath only if the
individual HWCAPs weren't detected/supported, but that soon becomes quite
a mess if we're adding more than a couple extensions.
So I guess after all that it's simplest to just go with CPUID, possibly
with a code comment that we could go with individual HWCAPs at some point
in the future if we want to simplify things and don't care about older
systems/toolchains.
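A minimal sketch of what the CPUID-only variant might look like, assuming
the DP field sits in ID_AA64ISAR0_EL1 bits [47:44] and the I8MM field in
ID_AA64ISAR1_EL1 bits [55:52]. The EX_FLAG_* names, id_field(),
flags_from_id_regs() and detect_flags_cpuid() are made up for
illustration and are not what the final patch will necessarily use. The
field decoding is kept separate from the mrs reads so that it can be
checked on any host.

```c
#include <stdint.h>

/* Hypothetical stand-ins for AV_CPU_FLAG_DOTPROD / AV_CPU_FLAG_I8MM. */
#define EX_FLAG_DOTPROD (1 << 0)
#define EX_FLAG_I8MM    (1 << 1)

/* Extract the 4-bit ID register field that starts at bit `shift`. */
static unsigned id_field(uint64_t reg, unsigned shift)
{
    return (unsigned)((reg >> shift) & 0xf);
}

/* Decode feature flags from the two ISA feature registers. */
static int flags_from_id_regs(uint64_t isar0, uint64_t isar1)
{
    int flags = 0;

    if (id_field(isar0, 44) >= 1)   /* ID_AA64ISAR0_EL1.DP */
        flags |= EX_FLAG_DOTPROD;
    if (id_field(isar1, 52) >= 1)   /* ID_AA64ISAR1_EL1.I8MM */
        flags |= EX_FLAG_I8MM;
    return flags;
}

#if defined(__aarch64__) && defined(__linux__)
#include <sys/auxv.h>
#ifndef HWCAP_CPUID
#define HWCAP_CPUID (1UL << 11)     /* assumed value from <asm/hwcap.h> */
#endif

static int detect_flags_cpuid(void)
{
    uint64_t isar0, isar1;

    /* Only read the ID registers if the kernel advertises emulation. */
    if (!(getauxval(AT_HWCAP) & HWCAP_CPUID))
        return 0;
    __asm__("mrs %0, ID_AA64ISAR0_EL1" : "=r" (isar0));
    __asm__("mrs %0, ID_AA64ISAR1_EL1" : "=r" (isar1));
    return flags_from_id_regs(isar0, isar1);
}
#endif
```

Adding another extension is then one more id_field() check, with no
dependency on the HWCAP* macros beyond HWCAP_CPUID itself.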
// Martin