[FFmpeg-devel] [PATCH] avutil/mem: use C11 aligned_malloc()

Sun Feb 18 18:27:35 EET 2024

James Almer:
> Save for the Microsoft C Runtime library, where free() can't handle aligned
> buffers, aligned_malloc() should be available and working on all supported
> targets.
> Also, malloc() alone may be sufficient if alignment requirement is low, so add
> a check for it.
> 
> Signed-off-by: James Almer <jamrial at gmail.com>
> ---
>  configure       |  2 --
>  libavutil/mem.c | 42 ++++++------------------------------------
>  2 files changed, 6 insertions(+), 38 deletions(-)
> 
> diff --git a/configure b/configure
> index 7c45ac25c8..8fd2895ac2 100755
> --- a/configure
> +++ b/configure
> @@ -6450,8 +6450,6 @@ if test -n "$custom_allocator"; then
>  fi
>  
>  check_func_headers malloc.h _aligned_malloc     && enable aligned_malloc
> -check_func  ${malloc_prefix}memalign            && enable memalign
> -check_func  ${malloc_prefix}posix_memalign      && enable posix_memalign
>  
>  check_func  access
>  check_func_headers stdlib.h arc4random_buf
> diff --git a/libavutil/mem.c b/libavutil/mem.c
> index 36b8940a0c..a72981d1ab 100644
> --- a/libavutil/mem.c
> +++ b/libavutil/mem.c
> @@ -100,44 +100,14 @@ void *av_malloc(size_t size)
>      if (size > atomic_load_explicit(&max_alloc_size, memory_order_relaxed))
>          return NULL;
>  
> -#if HAVE_POSIX_MEMALIGN
> -    if (size) //OS X on SDK 10.6 has a broken posix_memalign implementation
> -    if (posix_memalign(&ptr, ALIGN, size))
> -        ptr = NULL;
> -#elif HAVE_ALIGNED_MALLOC
> +#if HAVE_ALIGNED_MALLOC
>      ptr = _aligned_malloc(size, ALIGN);
> -#elif HAVE_MEMALIGN
> -#ifndef __DJGPP__
> -    ptr = memalign(ALIGN, size);
> -#else
> -    ptr = memalign(size, ALIGN);
> -#endif
> -    /* Why 64?
> -     * Indeed, we should align it:
> -     *   on  4 for 386
> -     *   on 16 for 486
> -     *   on 32 for 586, PPro - K6-III
> -     *   on 64 for K7 (maybe for P3 too).
> -     * Because L1 and L2 caches are aligned on those values.
> -     * But I don't want to code such logic here!
> -     */
> -    /* Why 32?
> -     * For AVX ASM. SSE / NEON needs only 16.
> -     * Why not larger? Because I did not see a difference in benchmarks ...
> -     */
> -    /* benchmarks with P3
> -     * memalign(64) + 1          3071, 3051, 3032
> -     * memalign(64) + 2          3051, 3032, 3041
> -     * memalign(64) + 4          2911, 2896, 2915
> -     * memalign(64) + 8          2545, 2554, 2550
> -     * memalign(64) + 16         2543, 2572, 2563
> -     * memalign(64) + 32         2546, 2545, 2571
> -     * memalign(64) + 64         2570, 2533, 2558
> -     *
> -     * BTW, malloc seems to do 8-byte alignment by default here.
> -     */
>  #else
> -    ptr = malloc(size);
> +    // malloc may already allocate sufficiently aligned buffers
> +    if (ALIGN > _Alignof(max_align_t))
> +        ptr = aligned_malloc(size, ALIGN);
> +    else
> +        ptr = malloc(size);
>  #endif
>      if(!ptr && !size) {
>          size = 1;

1. The function is called aligned_alloc (how did you test this?).
2. C11: "The value of alignment shall be a valid alignment supported by
the implementation and the value of size shall be an integral multiple
of alignment."
a) To use this, you would have to round size upwards; but this will make
sanitiziers more lenient.
b) If ALIGN is just not supported by the implementation, then everything
is UB in C11.
3. What's the advantage of this patch anyway?

- Andreas