[FFmpeg-devel] [PATCH 1/7] x86: hevc_mc: add AVX2 optimizations

Christophe Gisquet christophe.gisquet at gmail.com
Fri Feb 6 08:41:24 CET 2015


2015-02-06 1:15 GMT+01:00 James Almer <jamrial at gmail.com>:
> On 05/02/15 4:20 PM, Christophe Gisquet wrote:
>> From: plepere <pierre-edouard.lepere at insa-rennes.fr>
>
> This should probably be changed to Pierre Edouard Lepere.

Yeah, I amended with --author=lepere and that's what I got: it's what
appears in 9ba6b17add2, 942e22c651, 92cccb7bcd and in fact in any commit
he submitted.

But he's no longer at INSA, so the mail part is indeed less relevant.

>> +%if cpuflag(avx2) && (%0 == 3)
>> +
>> +    vextracti128 xm10, m0, 1
>> +    vinserti128 m10, m1, xm10, 0
>> +    vinserti128 m0, m0, xm1, 1
>> +    mova m1, m10
>> +
>> +    vextracti128 xm10, m2, 1
>> +    vinserti128 m10, m3, xm10, 0
>> +    vinserti128 m2, m2, xm3, 1
>> +    mova m3, m10
>> +
>> +
>> +    vextracti128 xm10, m4, 1
>> +    vinserti128 m10, m5, xm10, 0
>> +    vinserti128 m4, m4, xm5, 1
>> +    mova m5, m10
>> +
>> +    vextracti128 xm10, m6, 1
>> +    vinserti128 m10, m7, xm10, 0
>> +    vinserti128 m6, m6, xm7, 1
>> +    mova m7, m10
>> +%endif
>
> I didn't check, but I think these can be simplified using vperm2i128.
> It can be done in a separate patch anyway.

I'd prefer a separate patch, because I don't know AVX2, so I can neither
apply your comments nor review the result.
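
For reference, the lane shuffling in the quoted block and the suggested
vperm2i128 form can be compared with a small scalar model. This is only a
sketch under stated assumptions, not code from the patch: ymm_t and the
function names are made up for illustration, and the immediates 0x31 and
0x20 are one possible choice (select the two high lanes, resp. the two low
lanes, of the source pair). If the model is right, each four-instruction
group could become "vperm2i128 m10, m0, m1, 0x31", "vperm2i128 m0, m0, m1,
0x20", "mova m1, m10".

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* A 256-bit register modelled as two 128-bit lanes (hypothetical type). */
typedef struct { uint8_t lane[2][16]; } ymm_t;

/* Model of one group from the patch:
 *   vextracti128 xm10, m0, 1
 *   vinserti128  m10, m1, xm10, 0
 *   vinserti128  m0, m0, xm1, 1
 *   mova         m1, m10
 */
static void swap_extract_insert(ymm_t *m0, ymm_t *m1)
{
    ymm_t tmp = *m1;                       /* m10 starts as a copy of m1   */
    memcpy(tmp.lane[0], m0->lane[1], 16);  /* low(m10)  <- high(m0)        */
    memcpy(m0->lane[1], m1->lane[0], 16);  /* high(m0)  <- low(m1)         */
    *m1 = tmp;                             /* m1 <- [high(m0), high(m1)]   */
}

/* Model of vperm2i128 dst, src1, src2, imm8: bits 1:0 pick the low lane,
 * bits 5:4 pick the high lane, bits 3 and 7 zero the respective lane. */
static ymm_t vperm2i128(const ymm_t *src1, const ymm_t *src2, unsigned imm8)
{
    ymm_t dst;
    assert(imm8 <= 0xFF);
    for (int half = 0; half < 2; half++) {
        unsigned ctl = (imm8 >> (4 * half)) & 0xF;
        if (ctl & 8) {
            memset(dst.lane[half], 0, 16);
        } else {
            const ymm_t *src = (ctl & 2) ? src2 : src1;
            memcpy(dst.lane[half], src->lane[ctl & 1], 16);
        }
    }
    return dst;
}

/* Candidate replacement: two permutes plus a move. */
static void swap_vperm2i128(ymm_t *m0, ymm_t *m1)
{
    ymm_t hi = vperm2i128(m0, m1, 0x31);   /* [high(m0), high(m1)] */
    ymm_t lo = vperm2i128(m0, m1, 0x20);   /* [low(m0),  low(m1)]  */
    *m0 = lo;
    *m1 = hi;
}

/* Returns 1 when both sequences leave m0 and m1 in the same state. */
static int sequences_agree(void)
{
    ymm_t a0, a1, b0, b1;
    for (int i = 0; i < 16; i++) {
        a0.lane[0][i] = (uint8_t)i;
        a0.lane[1][i] = (uint8_t)(16 + i);
        a1.lane[0][i] = (uint8_t)(32 + i);
        a1.lane[1][i] = (uint8_t)(48 + i);
    }
    b0 = a0;
    b1 = a1;
    swap_extract_insert(&a0, &a1);
    swap_vperm2i128(&b0, &b1);
    return !memcmp(&a0, &b0, sizeof a0) && !memcmp(&a1, &b1, sizeof a1);
}
```

Calling sequences_agree() checks the two forms against each other on one
test pattern; it says nothing about speed, which would have to be measured
on the real asm.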

One thing you may also want to look at is that QPEL_HV lacks the
shuffling that QPEL has for 8 bits. Consequently, there's no 16-wide
AVX2 version of qpel_hv. That may also explain why OpenHEVC didn't get
much speed improvement from AVX2 on 8 bits.

I don't know whether adding that shuffling is feasible.

> It would be nice if all this was compressed to a couple macros like with SSE4. But that's
> cosmetics and not a blocker.

Yeah, I did tell myself it was for another patch.

> Should be ok if it passes fate

I think both you and Mickael validated it.

> and compiles with yasm <= 1.1.0 (there are C wrappers
> and those usually need more strict checks for HAVE_AVX2_EXTERNAL because dead code
> elimination doesn't seem to trigger until after pre-processing is done).

Is that equivalent to setting HAVE_AVX2_EXTERNAL to 0/!yes in config.*?
Doing so results in no AVX2 functions and no link issues, which I guess
is the expected outcome.
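
To make the concern concrete, here is a minimal sketch of the kind of
preprocessor guard being discussed. Everything except the macro name
HAVE_AVX2_EXTERNAL is hypothetical (put_fn, put_pixels_c,
hypothetical_put_pixels16_avx2, select_put_pixels32), and the macro is
forced to 0 locally to mimic an assembler that cannot emit AVX2. The point
it illustrates: a C wrapper that references an asm symbol has to sit under
"#if HAVE_AVX2_EXTERNAL" itself, since guarding only the call site with a
runtime "if" still leaves the wrapper body referencing a symbol that was
never assembled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Normally provided by config.h; forced to 0 here for illustration. */
#ifndef HAVE_AVX2_EXTERNAL
#define HAVE_AVX2_EXTERNAL 0
#endif

typedef void (*put_fn)(int16_t *dst, const uint8_t *src, int width);

/* Plain C fallback (hypothetical, for illustration only). */
static void put_pixels_c(int16_t *dst, const uint8_t *src, int width)
{
    for (int x = 0; x < width; x++)
        dst[x] = (int16_t)(src[x] << 6);
}

#if HAVE_AVX2_EXTERNAL
/* Hypothetical asm symbol: only exists if the assembler supports AVX2. */
void hypothetical_put_pixels16_avx2(int16_t *dst, const uint8_t *src,
                                    int width);

/* C wrapper building a 32-wide version from two 16-wide asm calls.
 * Without the surrounding #if, this body would reference a missing
 * symbol and could fail at link time. */
static void put_pixels32_avx2(int16_t *dst, const uint8_t *src, int width)
{
    (void)width;
    hypothetical_put_pixels16_avx2(dst,      src,      16);
    hypothetical_put_pixels16_avx2(dst + 16, src + 16, 16);
}
#endif

/* Picks the implementation; cpu_has_avx2 stands in for a runtime
 * CPU-flag check. */
static put_fn select_put_pixels32(int cpu_has_avx2)
{
    put_fn fn = put_pixels_c;
#if HAVE_AVX2_EXTERNAL
    if (cpu_has_avx2)
        fn = put_pixels32_avx2;
#else
    (void)cpu_has_avx2;
#endif
    return fn;
}
```

With HAVE_AVX2_EXTERNAL at 0, select_put_pixels32() returns the C fallback
even when the CPU flag is set, and no AVX2 symbol is referenced anywhere in
the object file.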

-- 
Christophe

