[FFmpeg-devel] [PATCH v2] cpu: add a function for querying maximum required data alignment

wm4 nfxjfg at googlemail.com
Thu Sep 28 03:19:19 EEST 2017


On Wed, 27 Sep 2017 21:15:39 -0300
James Almer <jamrial at gmail.com> wrote:

> On 9/4/2017 5:53 PM, Michael Niedermayer wrote:
> > On Mon, Sep 04, 2017 at 09:27:32PM +0200, wm4 wrote:  
> >> On Mon, 4 Sep 2017 21:18:35 +0200
> >> Michael Niedermayer <michael at niedermayer.cc> wrote:
> >>  
> >>> On Sat, Sep 02, 2017 at 09:47:38PM -0300, James Almer wrote:  
> >>>> From: Anton Khirnov <anton at khirnov.net>
> >>>>
> >>>> (cherry picked from commit e6bff23f1e11aefb16a2b5d6ee72bf7469c5a66e)
> >>>> Signed-off-by: James Almer <jamrial at gmail.com>
> >>>> ---
> >>>> This is (afaics) the last API introduced to libav before the major bump.
> >>>>
> >>>> Now checking all the x86 flags that would require aligment of 16 bytes
> >>>> or more.
> >>>>
> >>>>  doc/APIchanges      |  3 +++
> >>>>  libavutil/cpu.c     | 35 +++++++++++++++++++++++++++++++++++
> >>>>  libavutil/cpu.h     | 13 +++++++++++++
> >>>>  libavutil/version.h |  2 +-
> >>>>  4 files changed, 52 insertions(+), 1 deletion(-)
> >>>>
> >>>> diff --git a/doc/APIchanges b/doc/APIchanges
> >>>> index 4effbf9364..6a57c210a9 100644
> >>>> --- a/doc/APIchanges
> >>>> +++ b/doc/APIchanges
> >>>> @@ -15,6 +15,9 @@ libavutil:     2015-08-28
> >>>>  
> >>>>  API changes, most recent first:
> >>>>  
> >>>> +2017-09-xx - xxxxxxx - lavu 55.75.100 / lavu 55.31.0 - cpu.h
> >>>> +  Add av_cpu_max_align() for querying maximum required data alignment.
> >>>> +
> >>>>  2017-09-01 - xxxxxxx - lavf 57.81.100 - avio.h
> >>>>    Add avio_read_partial().
> >>>>  
> >>>> diff --git a/libavutil/cpu.c b/libavutil/cpu.c
> >>>> index a22da0fa8c..4f04da2460 100644
> >>>> --- a/libavutil/cpu.c
> >>>> +++ b/libavutil/cpu.c
> >>>> @@ -16,9 +16,11 @@
> >>>>   * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
> >>>>   */
> >>>>  
> >>>> +#include <stddef.h>
> >>>>  #include <stdint.h>
> >>>>  #include <stdatomic.h>
> >>>>  
> >>>> +#include "attributes.h"
> >>>>  #include "cpu.h"
> >>>>  #include "cpu_internal.h"
> >>>>  #include "config.h"
> >>>> @@ -299,3 +301,36 @@ int av_cpu_count(void)
> >>>>  
> >>>>      return nb_cpus;
> >>>>  }
> >>>> +
> >>>> +size_t av_cpu_max_align(void)
> >>>> +{
> >>>> +    int av_unused flags = av_get_cpu_flags();
> >>>> +
> >>>> +#if ARCH_ARM || ARCH_AARCH64
> >>>> +    if (flags & AV_CPU_FLAG_NEON)
> >>>> +        return 16;
> >>>> +#elif ARCH_PPC
> >>>> +    if (flags & AV_CPU_FLAG_ALTIVEC)
> >>>> +        return 16;
> >>>> +#elif ARCH_X86
> >>>> +    if (flags & (AV_CPU_FLAG_AVX2     |
> >>>> +                 AV_CPU_FLAG_AVX      |
> >>>> +                 AV_CPU_FLAG_FMA4     |
> >>>> +                 AV_CPU_FLAG_FMA3))
> >>>> +        return 32;
> >>>> +    if (flags & (AV_CPU_FLAG_XOP      |
> >>>> +                 AV_CPU_FLAG_AESNI    |
> >>>> +                 AV_CPU_FLAG_SSE42    |
> >>>> +                 AV_CPU_FLAG_SSE4     |
> >>>> +                 AV_CPU_FLAG_SSSE3    |
> >>>> +                 AV_CPU_FLAG_SSE3     |
> >>>> +                 AV_CPU_FLAG_SSE2     |
> >>>> +                 AV_CPU_FLAG_SSE      |
> >>>> +                 AV_CPU_FLAG_AVXSLOW  |
> >>>> +                 AV_CPU_FLAG_SSE3SLOW |
> >>>> +                 AV_CPU_FLAG_SSE2SLOW))
> >>>> +        return 16;
> >>>> +#endif
> >>>> +
> >>>> +    return 8;
> >>>> +}
> >>>> diff --git a/libavutil/cpu.h b/libavutil/cpu.h
> >>>> index de05593446..9e5d40affe 100644
> >>>> --- a/libavutil/cpu.h
> >>>> +++ b/libavutil/cpu.h
> >>>> @@ -21,6 +21,8 @@
> >>>>  #ifndef AVUTIL_CPU_H
> >>>>  #define AVUTIL_CPU_H
> >>>>  
> >>>> +#include <stddef.h>
> >>>> +
> >>>>  #include "attributes.h"
> >>>>  
> >>>>  #define AV_CPU_FLAG_FORCE    0x80000000 /* force usage of selected flags (OR) */    
> >>>
> >>>  
> >>>> @@ -113,4 +115,15 @@ int av_parse_cpu_caps(unsigned *flags, const char *s);
> >>>>   */
> >>>>  int av_cpu_count(void);
> >>>>  
> >>>> +/**
> >>>> + * Get the maximum data alignment that may be required by FFmpeg.
> >>>> + *
> >>>> + * Note that this is affected by the build configuration and the CPU flags mask,
> >>>> + * so e.g. if the CPU supports AVX, but libavutil has been built with
> >>>> + * --disable-avx or the AV_CPU_FLAG_AVX flag has been disabled through
> >>>> + *  av_set_cpu_flags_mask(), then this function will behave as if AVX is not
> >>>> + *  present.
> >>>> + */
> >>>> +size_t av_cpu_max_align(void);    
> >>>
> >>> This might interact badly with runtime cpu flags/mask changes
> >>>
> >>> If its used to choose the alignment for allocated frames and
> >>> after some are allocated the cpu flags are changed there could be
> >>> still frames in queues that may not have sufficient alignment for the
> >>> new flags  
> >>
> >> There's no such thing as runtime CPU flag changes.  
> > 
> > We even have an API to change the cpu flags at runtime.
> > 
> > av_set_cpu_flags_mask() and av_force_cpu_flags()
> > 
> > There is no restriction in the API on when they can be called.
> > 
> > you can call av_force_cpu_flags(0) then open a decoder then call
> > av_force_cpu_flags(AV_CPU_FLAG_MMX|AV_CPU_FLAG_SSE)
> > then open a filter or encoder
> > then run the code and it could crash as the allocated frames from
> > earlier are not aligned enough for the 2nd filter or encoder
> > 
> > There also may be other scenarios where this can occur  
> 
> Alright, I just arrived to this cherry pick during merges. So my
> question is, do i apply it and skip the commit that makes use of it to
> choose alignment within ffmpeg?
> 
> Even if unused at first, I'd rather not skip its addition.

API users can rely on it. So you better make sure that can't break.
Other aspects don't really matter.


More information about the ffmpeg-devel mailing list