[FFmpeg-devel] [PATCH] cpu: add a function for querying maximum required data alignment
James Almer
jamrial at gmail.com
Sat Sep 2 21:48:40 EEST 2017
On 9/2/2017 3:29 PM, Clément Bœsch wrote:
> On Sat, Sep 02, 2017 at 02:07:01PM -0300, James Almer wrote:
> [...]
>> +size_t av_cpu_max_align(void)
>> +{
>> +    int av_unused flags = av_get_cpu_flags();
>> +
>> +#if ARCH_ARM || ARCH_AARCH64
>> +    if (flags & AV_CPU_FLAG_NEON)
>> +        return 16;
>> +#elif ARCH_PPC
>> +    if (flags & AV_CPU_FLAG_ALTIVEC)
>> +        return 16;
>
>> +#elif ARCH_X86
>> +    if (flags & AV_CPU_FLAG_AVX)
>> +        return 32;
>> +    if (flags & AV_CPU_FLAG_SSE)
>> +        return 16;
>> +#endif
>
> mmh, will this really work in FFmpeg? I think we have a difference related
> to the flags dependency. Typically, having SSE2 doesn't imply you have
> SSE. I think you may want to extend the mask.
Mmh, you're right, I forgot we have av_parse_cpu_caps().
What do I do then? Define two masks with all the CPU flags that would
apply for each alignment value?
That is, AVX through AVX2 plus FMA3/FMA4 and the slow variants for 32,
then SSE through SSE4 plus XOP and the slow variants for 16?
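
Something like the following is what I have in mind, as a rough standalone
sketch only. The DEMO_FLAG_* names and bit values are made up for
illustration and are not the real AV_CPU_FLAG_* definitions, the flags are
passed in as a parameter instead of coming from av_get_cpu_flags(), and the
fallback of 8 is an assumption:

```c
#include <stddef.h>

/* Hypothetical stand-ins for the AV_CPU_FLAG_* bits. The values are
 * illustrative only and do not match libavutil/cpu.h. */
enum {
    DEMO_FLAG_SSE      = 1 << 0,
    DEMO_FLAG_SSE2     = 1 << 1,
    DEMO_FLAG_SSE2SLOW = 1 << 2,
    DEMO_FLAG_SSE3     = 1 << 3,
    DEMO_FLAG_SSE3SLOW = 1 << 4,
    DEMO_FLAG_SSSE3    = 1 << 5,
    DEMO_FLAG_SSE4     = 1 << 6,
    DEMO_FLAG_SSE42    = 1 << 7,
    DEMO_FLAG_XOP      = 1 << 8,
    DEMO_FLAG_AVX      = 1 << 9,
    DEMO_FLAG_AVXSLOW  = 1 << 10,
    DEMO_FLAG_FMA3     = 1 << 11,
    DEMO_FLAG_FMA4     = 1 << 12,
    DEMO_FLAG_AVX2     = 1 << 13,
};

/* Any of these bits would call for 32-byte alignment. */
#define DEMO_ALIGN32_MASK (DEMO_FLAG_AVX  | DEMO_FLAG_AVXSLOW | \
                           DEMO_FLAG_FMA3 | DEMO_FLAG_FMA4    | \
                           DEMO_FLAG_AVX2)

/* Any of these bits would call for 16-byte alignment. */
#define DEMO_ALIGN16_MASK (DEMO_FLAG_SSE   | DEMO_FLAG_SSE2     | \
                           DEMO_FLAG_SSE2SLOW | DEMO_FLAG_SSE3  | \
                           DEMO_FLAG_SSE3SLOW | DEMO_FLAG_SSSE3 | \
                           DEMO_FLAG_SSE4  | DEMO_FLAG_SSE42    | \
                           DEMO_FLAG_XOP)

/* Test against whole masks rather than a single bit, so a flag set that
 * contains SSE2 but not SSE still yields 16. */
static size_t demo_max_align(int flags)
{
    if (flags & DEMO_ALIGN32_MASK)
        return 32;
    if (flags & DEMO_ALIGN16_MASK)
        return 16;
    return 8; /* assumed fallback when no SIMD flag is set */
}
```

With this, demo_max_align(DEMO_FLAG_SSE2) gives 16 even though the SSE bit
is not set, which is the case the single-flag check misses.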
>
> [...]