[FFmpeg-devel] [PATCH] x86/dsputil: port clear_block functions to yasm
James Almer
jamrial at gmail.com
Wed May 21 18:42:42 CEST 2014
On 21/05/14 4:43 AM, Christophe Gisquet wrote:
> Hi,
>
> 2014-05-21 8:53 GMT+02:00 James Almer <jamrial at gmail.com>:
>> +INIT_XMM sse
>> +%define ZERO xorps
>> +CLEAR_BLOCK 1, 1
> [...]
>> +INIT_XMM sse
>> +%define ZERO xorps
>> +CLEAR_BLOCKS 1
>
> Maybe it crossed your mind and then you crossed it out for lack of
> benefit, but a sse2 and even maybe an avx version might make sense?
Tried an AVX version, but it crashed on me, so it seems the blocks are only
16-byte aligned.
Didn't look too much into it, though.
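
To illustrate why alignment would explain such a crash (unconfirmed, since I didn't dig further): an aligned 32-byte store such as vmovaps with a ymm register faults unless the address is a multiple of 32, and a buffer that is 16-byte aligned gives no such guarantee. A minimal sketch in C, where `struct demo`, `demo_buf` and `is_aligned` are made-up names and not FFmpeg code:

```c
#include <assert.h>    /* static_assert (C11) */
#include <stdalign.h>  /* alignas (C11) */
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if p is a multiple of `alignment` (a power of two), else 0. */
static int is_aligned(const void *p, size_t alignment)
{
    return ((uintptr_t)p & (alignment - 1)) == 0;
}

/* Hypothetical layout: the struct starts on a 32-byte boundary, and a
 * 16-byte pad pushes the coefficient block to offset 16. */
struct demo {
    alignas(32) unsigned char pad[16];
    int16_t block[64];  /* 16-byte aligned, but never 32-byte aligned */
};

static_assert(offsetof(struct demo, block) == 16,
              "block is expected to start 16 bytes into the struct");

static struct demo demo_buf;
```

Here `is_aligned(demo_buf.block, 16)` holds while `is_aligned(demo_buf.block, 32)` does not, so 16-byte SSE stores to the block are fine and an aligned 32-byte AVX store would fault.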
I'm also not sure an SSE2 version is worth it. The function is not a critical
one (it is mostly used by vc1), and switching xorps -> pxor and movaps -> movdqa
will probably not make much of a difference.
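
For reference, the two flavours differ only in the instruction domain, not in the amount of work. A rough C intrinsics sketch, assuming a block is 64 int16_t coefficients (128 bytes) and 16-byte aligned; the function names are made up and this is not the yasm code from the patch:

```c
#include <assert.h>     /* static_assert (C11) */
#include <emmintrin.h>  /* SSE2: _mm_setzero_si128, _mm_store_si128 */
#include <stddef.h>
#include <stdint.h>
#include <xmmintrin.h>  /* SSE: _mm_setzero_ps, _mm_store_ps */

#define BLOCK_COEFFS 64  /* assumed block size */
#define BLOCK_BYTES  (BLOCK_COEFFS * sizeof(int16_t))

static_assert(BLOCK_BYTES % 16 == 0, "block must be a whole number of xmm stores");

/* SSE flavour: the compiler typically emits xorps + movaps. */
static void clear_block_sse_sketch(int16_t *block)
{
    const __m128 zero = _mm_setzero_ps();
    for (size_t off = 0; off < BLOCK_BYTES; off += 16)
        _mm_store_ps((float *)((char *)block + off), zero);
}

/* SSE2 flavour: the compiler typically emits pxor + movdqa. */
static void clear_block_sse2_sketch(int16_t *block)
{
    const __m128i zero = _mm_setzero_si128();
    for (size_t off = 0; off < BLOCK_BYTES; off += 16)
        _mm_store_si128((__m128i *)((char *)block + off), zero);
}
```

Both versions issue eight aligned 16-byte stores per block, which is why I don't expect a measurable gain from the integer variant.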
>
>> +#if HAVE_YASM
>> +#if HAVE_SSE_EXTERNAL
>
> From the discussion on HAVE_MMX_EXTERNAL, I would expect
> HAVE_SSE_EXTERNAL implies HAVE_YASM.
> Probably needs a confirmation from someone knowing what he's talking
> about (i.e. not me).
Yes, it does, but the HAVE_YASM check there covers other stuff below the
initializations I added, so I left it in place.
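
Roughly, the nesting looks like the sketch below. The macro values, `struct demo_ctx` and `demo_init` are made up for illustration (in FFmpeg the HAVE_* macros come from the generated config.h), and the static_assert just spells out the implication discussed above:

```c
#include <assert.h>  /* static_assert (C11) */

/* Assumed values for this sketch only. */
#define HAVE_YASM 1
#define HAVE_SSE_EXTERNAL 1

/* HAVE_SSE_EXTERNAL is expected to imply HAVE_YASM. */
static_assert(!HAVE_SSE_EXTERNAL || HAVE_YASM,
              "external SSE code requires an assembler");

/* Hypothetical context recording which initializations ran. */
struct demo_ctx {
    int sse_clear_block;  /* set by the new SSE initializations */
    int other_yasm_init;  /* set by other yasm-only code further down */
};

static void demo_init(struct demo_ctx *c)
{
    c->sse_clear_block = 0;
    c->other_yasm_init = 0;
#if HAVE_YASM
#if HAVE_SSE_EXTERNAL
    c->sse_clear_block = 1;  /* inner guard: redundant for this line alone */
#endif
    c->other_yasm_init = 1;  /* still needs the outer HAVE_YASM guard */
#endif
}
```

The outer check is redundant for the SSE lines on their own, but removing it would expose the other yasm-only code that follows.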
> Otherwise OK, this is a straightforward conversion.
>
> Best regards,
>