[FFmpeg-devel] avfilter/x86/vf_blend : add avx2 for 8b func (v2)
Martin Vignali
martin.vignali at gmail.com
Sun Jan 28 21:27:21 EET 2018
2018-01-17 21:13 GMT+01:00 Martin Vignali <martin.vignali at gmail.com>:
> Hello,
>
>
> New patch attached.
>
> It modifies average, grain extract, multiply, screen and grain merge.
>
>
> -- blend Average --
> Prev patch :
> average_c: 15605.4
> average_sse2: 1205.9
> average_avx2: 772.4
>
> New patch :
> average_c: 15604.4
> average_sse2: 490.9
> average_avx2: 265.2
>
> With the 3-operand form, using:
> %if cpuflag(avx)
> pxor m0, m2, [topq + xq]
> pxor m1, m2, [bottomq + xq]
> %else
> movu m0, [topq + xq]
> movu m1, [bottomq + xq]
> pxor m0, m2
> pxor m1, m2
> %endif
>
> average_c: 15615.5
> average_sse2: 456.2
> average_avx: 553.7
> average_avx2: 387.0
>
>
> And for grain extract, multiply, screen and grain merge,
> processing mmsize bytes per loop iteration (instead of mmsize / 2):
>
> -- Grain extract --
> Prev :
> grainextract_c: 22182.9
> grainextract_sse2: 1158.9
> grainextract_avx2: 777.6
>
> New :
> grainextract_c: 22206.5
> grainextract_sse2: 964.8
> grainextract_avx2: 485.3
>
> -- Multiply --
> Prev :
> multiply_c: 41347.8
> multiply_sse2: 1376.0
> multiply_avx2: 840.0
>
> New :
> multiply_c: 40432.5
> multiply_sse2: 1248.0
> multiply_avx2: 635.0
>
> -- Screen --
> Prev :
> screen_c: 21635.8
> screen_sse2: 1801.5
> screen_avx2: 1069.8
>
> New :
> screen_c: 21617.0
> screen_sse2: 1625.7
> screen_avx2: 840.2
>
> -- Grain merge --
> Prev :
> grainmerge_c: 25233.5
> grainmerge_sse2: 1158.0
> grainmerge_avx2: 775.7
>
> New :
> grainmerge_c: 25246.7
> grainmerge_sse2: 967.4
> grainmerge_avx2: 487.7
>
>
> Martin
>
Pushed
Martin
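
For readers wondering what the pxor lines in the quoted average snippet are for, here is a scalar C sketch of one plausible reading. It assumes (the quoted snippet shows neither) that m2 holds all ones and that the complemented inputs are then averaged with pavgb and complemented again. pavgb rounds up, and this sketch also assumes the C reference truncates, i.e. computes (a + b) >> 1. The function names are made up for illustration.

```c
#include <stdint.h>

/* Rounding-up byte average: the per-lane behaviour of pavgb. */
static uint8_t avg_round_up(uint8_t a, uint8_t b)
{
    return (uint8_t)((a + b + 1) >> 1);
}

/* Truncating average built from the rounding-up one.
 * XOR with 0xFF stands in for "pxor reg, all_ones" (assumed mask). */
static uint8_t avg_floor_via_xor(uint8_t a, uint8_t b)
{
    return (uint8_t)(avg_round_up(a ^ 0xFF, b ^ 0xFF) ^ 0xFF);
}

/* Counts the (a, b) pairs where the XOR form differs from (a + b) >> 1. */
static int count_mismatches(void)
{
    int bad = 0;
    for (int a = 0; a < 256; a++)
        for (int b = 0; b < 256; b++)
            if (avg_floor_via_xor((uint8_t)a, (uint8_t)b) != ((a + b) >> 1))
                bad++;
    return bad;
}
```

Calling count_mismatches() from a small main() checks all 65536 input pairs. Under the assumptions above, complementing inputs and output turns the round-up average into a round-down one, so the whole register can be processed without unpacking to 16-bit words.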