[FFmpeg-devel] [PATCH v2 3/5] aarch64: Add Linux runtime cpu feature detection using getauxval(AT_HWCAP)

Rémi Denis-Courmont remi at remlab.net
Wed May 31 19:54:01 EEST 2023


On Tuesday, 30 May 2023, 15:30:41 EEST, Martin Storsjö wrote:
> Based partially on code by Janne Grunau.
> 
> ---
> Updated to use both the direct HWCAP* macros and HWCAP_CPUID. A
> not unreasonably old distribution like Ubuntu 20.04 does have
> HWCAP_CPUID but not HWCAP2_I8MM in the distribution provided headers.
> 
> Alternatively I guess we could carry our own fallback hardcoded values
> for the HWCAP* values we use and skip HWCAP_CPUID.
> ---
>  configure               |  2 ++
>  libavutil/aarch64/cpu.c | 63 +++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 65 insertions(+)
> 
> diff --git a/configure b/configure
> index 50eb27ba0e..b39de74de5 100755
> --- a/configure
> +++ b/configure
> @@ -2209,6 +2209,7 @@ HAVE_LIST_PUB="
> 
>  HEADERS_LIST="
>      arpa_inet_h
> +    asm_hwcap_h
>      asm_types_h
>      cdio_paranoia_h
>      cdio_paranoia_paranoia_h
> @@ -6432,6 +6433,7 @@ check_headers io.h
>  enabled libdrm &&
>      check_headers linux/dma-buf.h
> 
> +check_headers asm/hwcap.h
>  check_headers linux/perf_event.h
>  check_headers libcrystalhd/libcrystalhd_if.h
>  check_headers malloc.h
> diff --git a/libavutil/aarch64/cpu.c b/libavutil/aarch64/cpu.c
> index 0c76f5ad15..4563959ffd 100644
> --- a/libavutil/aarch64/cpu.c
> +++ b/libavutil/aarch64/cpu.c
> @@ -20,6 +20,67 @@
>  #include "libavutil/cpu_internal.h"
>  #include "config.h"
> 
> +#if (defined(__linux__) || defined(__ANDROID__)) && HAVE_GETAUXVAL && HAVE_ASM_HWCAP_H
> +#include <stdint.h>
> +#include <asm/hwcap.h>
> +#include <sys/auxv.h>
> +
> +#define get_cpu_feature_reg(reg, val) \
> +        __asm__("mrs %0, " #reg : "=r" (val))
> +
> +static int detect_flags(void)
> +{
> +    int flags = 0;
> +    unsigned long hwcap, hwcap2;
> +
> +    // Check for support using direct individual HWCAPs
> +    hwcap = getauxval(AT_HWCAP);
> +#ifdef HWCAP_ASIMDDP
> +    if (hwcap & HWCAP_ASIMDDP)
> +        flags |= AV_CPU_FLAG_DOTPROD;
> +#endif
> +
> +#ifdef AT_HWCAP2
> +    hwcap2 = getauxval(AT_HWCAP2);
> +#ifdef HWCAP2_I8MM
> +    if (hwcap2 & HWCAP2_I8MM)
> +        flags |= AV_CPU_FLAG_I8MM;
> +#endif
> +#endif
> +
> +    // Silence warnings if none of the hwcaps to check are known.
> +    (void)hwcap;
> +    (void)hwcap2;
> +
> +#if defined(HWCAP_CPUID)
> +    // The HWCAP_* defines for individual extensions may become available late, as
> +    // they require updates to userland headers. As a fallback, see if we can access
> +    // the CPUID registers (trapped via the kernel).
> +    // See https://www.kernel.org/doc/html/latest/arm64/cpu-feature-registers.html

I don't actually care which method is used, or whether the missing constants
are hard-coded or not. But using both methods is weird. If you are going to
trigger the TID3 traps anyway, there is no point checking the auxiliary
vector first, AFAICT.

You *could* check the auxiliary vector as a run-time fallback if HWCAP_CPUID
is *not* set, but that only really makes sense for HWCAP_FP and HWCAP_ASIMD,
not for HWCAP_ASIMDDP (Linux 4.15) and HWCAP2_I8MM (Linux 5.6), which are more
recent than HWCAP_CPUID (Linux 4.11). And even then, that would only matter in
the corner case where FP and/or AdvSIMD were explicitly disabled, since they
are on by default for all AArch64 targets.
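
For illustration, a minimal sketch of the CPUID-only ordering described above.
This is not code from the patch: the SKETCH_FLAG_* values and the function
names are hypothetical stand-ins for AV_CPU_FLAG_* and detect_flags(), and it
assumes <asm/hwcap.h> and getauxval() are available instead of checking for
them in configure. Treating the ID fields as "at least 1" rather than
"exactly 1" is also a choice made for this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flag values for this sketch only; the real ones are the
 * AV_CPU_FLAG_* constants. */
#define SKETCH_FLAG_DOTPROD (1 << 0)
#define SKETCH_FLAG_I8MM    (1 << 1)

/* Extract one 4-bit field from an ID register value. */
static unsigned id_field(uint64_t reg, unsigned shift)
{
    assert(shift <= 60 && shift % 4 == 0);
    return (unsigned)((reg >> shift) & 0xf);
}

/* Pure decoding step, separated from the register reads.
 * DP is ID_AA64ISAR0_EL1[47:44], I8MM is ID_AA64ISAR1_EL1[55:52]. */
int flags_from_id_regs(uint64_t isar0, uint64_t isar1)
{
    int flags = 0;

    if (id_field(isar0, 44) >= 1)
        flags |= SKETCH_FLAG_DOTPROD;
    if (id_field(isar1, 52) >= 1)
        flags |= SKETCH_FLAG_I8MM;
    return flags;
}

#if defined(__aarch64__) && defined(__linux__)
#include <asm/hwcap.h>
#include <sys/auxv.h>

int detect_flags_sketch(void)
{
    int flags = 0;

#ifdef HWCAP_CPUID
    /* Only touch the ID registers if the kernel says it emulates them. */
    if (getauxval(AT_HWCAP) & HWCAP_CPUID) {
        uint64_t isar0, isar1;

        __asm__("mrs %0, ID_AA64ISAR0_EL1" : "=r" (isar0));
        __asm__("mrs %0, ID_AA64ISAR1_EL1" : "=r" (isar1));
        flags |= flags_from_id_regs(isar0, isar1);
    }
    /* No else branch: a kernel without HWCAP_CPUID (before 4.11) also
     * predates HWCAP_ASIMDDP and HWCAP2_I8MM, so there is nothing to
     * fall back to for these two features. */
#endif
    return flags;
}
#endif
```

The other consistent option would be to drop the mrs reads entirely and rely
on the HWCAP bits alone, with hard-coded fallback values for the constants
that older headers lack.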

-- 
Реми Дёни-Курмон
http://www.remlab.net/




