[FFmpeg-devel] [PATCH 2/2] avutil/float_dsp: add ff_vector_dmul_{sse2, avx}
James Almer
jamrial at gmail.com
Fri Sep 14 16:26:46 EEST 2018
On 9/14/2018 9:57 AM, Henrik Gramner wrote:
> On Thu, Sep 13, 2018 at 3:08 PM, James Almer <jamrial at gmail.com> wrote:
>> + lea lenq, [lend*8 - mmsize*4]
>
> Is len guaranteed to be a multiple of mmsize/8? Otherwise this would
> cause misalignment. It will also break if len < mmsize/2.
len must be a multiple of 16 as per the doxy, so yes.
The only way for len to be < mmsize/2 is if we add an avx512 version.
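
To make the arithmetic concrete, here is a small C model of the offset that lea produces. The loop shape (handle mmsize*4 bytes per iteration, step the offset back by mmsize*4, repeat while it is still >= 0) is an assumption, since only the lea line is quoted above, and the function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Byte offset computed by "lea len, [len*8 - mmsize*4]":
 * len doubles of 8 bytes each, minus one iteration's worth of data. */
int64_t start_offset(int len, int mmsize)
{
    return (int64_t)len * 8 - (int64_t)mmsize * 4;
}

/* Nonzero if the starting offset is a multiple of the register size. */
int offset_is_aligned(int len, int mmsize)
{
    return start_offset(len, mmsize) % mmsize == 0;
}

/* ASSUMED loop shape (not shown in the quoted hunk): process
 * mmsize*4 bytes, subtract mmsize*4, loop while the offset is >= 0.
 * Returns how many doubles such a loop would touch. */
int doubles_covered(int len, int mmsize)
{
    int64_t off = start_offset(len, mmsize);
    int covered = 0;
    do {
        covered += mmsize * 4 / 8;  /* doubles per iteration */
        off -= mmsize * 4;
    } while (off >= 0);
    return covered;
}

/* The smallest len allowed by the doxy (16), at three register sizes. */
void check_len16(void)
{
    assert(start_offset(16, 16) == 64);    /* sse2: two iterations */
    assert(start_offset(16, 32) == 0);     /* avx: exactly one iteration */
    assert(start_offset(16, 64) == -128);  /* 64-byte regs: starts before the buffer */
}
```

Under that assumed loop, len = 16 is covered exactly for mmsize 16 and 32, while a 64-byte register size would start at a negative offset and touch 32 doubles.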
>
> Also if you want a 32-bit result from lea it should be written as "lea
> lend, [lenq*8 - mmsize*4]" which is equivalent but has a shorter
> opcode (e.g. always use native sizes within brackets).
len is an int, so I assume this is only possible here because it's an
argument passed in a register and not on the stack? Otherwise, the upper
32 bits would probably make a mess with the multiplication. See for
example vector_fmul_add, where len is the fifth argument.
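
As a rough check of the arithmetic only, the following C sketch models a lea with a 64-bit expression inside the brackets and a 32-bit destination. It assumes the usual x86-64 behaviour that a write to a 32-bit register keeps the low 32 bits of the result and zeroes the upper half. The function name is hypothetical, and this says nothing about what x86inc or the calling convention actually leaves in the upper bits of an argument.

```c
#include <assert.h>
#include <stdint.h>

/* Model of "lea lend, [lenq*8 - mmsize*4]": the expression is evaluated
 * at 64-bit width, then truncated to 32 bits and zero-extended. */
uint64_t lea_32bit_dest(uint64_t lenq, uint32_t mmsize)
{
    uint64_t full = lenq * 8 - (uint64_t)mmsize * 4;
    return (uint32_t)full;  /* keep the low 32 bits, upper half becomes 0 */
}

/* Same len (16) with and without junk in the upper 32 bits. */
void check_upper_bits(void)
{
    uint64_t clean = 16;
    uint64_t dirty = 0xDEADBEEF00000000ULL | 16;
    assert(lea_32bit_dest(clean, 32) == lea_32bit_dest(dirty, 32));
    assert(lea_32bit_dest(dirty, 16) == 64);
}
```

In this model the low 32 bits of lenq*8 - mmsize*4 depend only on the low 32 bits of lenq, so junk in the upper half drops out at the truncation. The zero extension also means a result that should be negative (len < mmsize/2) comes out as a large positive 64-bit value.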