[FFmpeg-devel] [PATCH] avfilter/vf_blend: add x86 SIMD for some modes

Henrik Gramner henrik at gramner.com
Fri Oct 2 19:48:24 CEST 2015


On Fri, Oct 2, 2015 at 6:57 PM, Paul B Mahol <onemda at gmail.com> wrote:
> +INIT_XMM sse2
> +cglobal blend_xor, 9, 10, 2, 0, top, top_linesize, bottom, bottom_linesize, dst, dst_linesize, width, start, end
[...]
> +cglobal blend_or, 9, 10, 2, 0, top, top_linesize, bottom, bottom_linesize, dst, dst_linesize, width, start, end
[...]
> +cglobal blend_and, 9, 10, 2, 0, top, top_linesize, bottom, bottom_linesize, dst, dst_linesize, width, start, end

You could do those using the floating-point bitwise instructions (xorps,
orps, andps); then you only need SSE instead of SSE2 (and AVX instead of
AVX2 if you want to make versions using ymm registers).
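
To illustrate the point in C intrinsics rather than the patch's x86inc
syntax, here is a minimal sketch that XORs two byte rows with _mm_xor_ps,
the intrinsic for xorps, using nothing beyond SSE. The function name
blend_xor_row, its signature, the scalar tail and the non-x86 fallback are
all made up for this example and are not taken from the patch.

```c
#include <stddef.h>
#include <stdint.h>
#if defined(__SSE__)
#include <xmmintrin.h>  /* SSE only: _mm_loadu_ps, _mm_xor_ps, _mm_storeu_ps */
#endif

/* Hypothetical helper, not from the patch: dst[i] = top[i] ^ bottom[i]. */
static void blend_xor_row(const uint8_t *top, const uint8_t *bottom,
                          uint8_t *dst, size_t width)
{
    size_t x = 0;
#if defined(__SSE__)
    /* xorps is a pure bitwise operation, so the pixel bytes are never
     * interpreted as floats; 16 pixels are handled per iteration. */
    for (; x + 16 <= width; x += 16) {
        __m128 a = _mm_loadu_ps((const float *)(const void *)(top + x));
        __m128 b = _mm_loadu_ps((const float *)(const void *)(bottom + x));
        _mm_storeu_ps((float *)(void *)(dst + x), _mm_xor_ps(a, b));
    }
#endif
    /* Scalar tail (and the whole row on builds without SSE). */
    for (; x < width; x++)
        dst[x] = top[x] ^ bottom[x];
}

/* Returns 1 if blend_xor_row matches a plain scalar XOR on a test row. */
static int blend_xor_row_selftest(void)
{
    uint8_t top[37], bottom[37], dst[37];
    for (size_t i = 0; i < sizeof(top); i++) {
        top[i]    = (uint8_t)(i * 7 + 3);
        bottom[i] = (uint8_t)(255 - i * 5);
    }
    blend_xor_row(top, bottom, dst, sizeof(top));
    for (size_t i = 0; i < sizeof(top); i++)
        if (dst[i] != (uint8_t)(top[i] ^ bottom[i]))
            return 0;
    return 1;
}
```

The same shape works for or/and by swapping in _mm_or_ps or _mm_and_ps.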

> +cglobal blend_addition, 9, 10, 3, 0, top, top_linesize, bottom, bottom_linesize, dst, dst_linesize, width, start, end
[...]
> +        punpcklbw       m0, m2
> +        punpcklbw       m1, m2
> +        paddw           m0, m1
> +        packuswb        m0, m0
> +        movh    [dstq + x], m0
> +        add           r10q, mmsize / 2

Use paddusb here: it adds unsigned bytes with saturation at 255, so no
unpacking or packing is needed.

> +cglobal blend_subtract, 9, 10, 3, 0, top, top_linesize, bottom, bottom_linesize, dst, dst_linesize, width, start, end
[...]
> +        punpcklbw       m0, m2
> +        punpcklbw       m1, m2
> +        psubw           m0, m1
> +        packuswb        m0, m0

Use psubusb here: it subtracts unsigned bytes with saturation at 0, so no
unpacking or packing is needed.
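
The paddusb and psubusb suggestions rest on the widen/operate/pack sequence
giving the same result as a saturating byte operation. The scalar C model
below checks that for a single byte lane over all 65536 input pairs. It is
only a sketch of the instruction semantics; every function name in it is
invented for this example.

```c
#include <stdint.h>

/* packuswb on one lane: signed 16-bit word -> unsigned byte, saturating. */
static uint8_t pack_us(int16_t w)
{
    return w < 0 ? 0 : w > 255 ? 255 : (uint8_t)w;
}

/* punpcklbw with zero + paddw/psubw + packuswb, modelled for one lane. */
static uint8_t add_widen_pack(uint8_t a, uint8_t b)
{
    return pack_us((int16_t)(a + b));
}

static uint8_t sub_widen_pack(uint8_t a, uint8_t b)
{
    return pack_us((int16_t)(a - b));
}

/* paddusb / psubusb, modelled for one lane. */
static uint8_t paddusb_lane(uint8_t a, uint8_t b)
{
    unsigned s = (unsigned)a + b;
    return s > 255 ? 255 : (uint8_t)s;
}

static uint8_t psubusb_lane(uint8_t a, uint8_t b)
{
    return a > b ? (uint8_t)(a - b) : 0;
}

/* Counts input pairs where the two formulations disagree. */
static int count_mismatches(void)
{
    int bad = 0;
    for (int a = 0; a < 256; a++)
        for (int b = 0; b < 256; b++) {
            bad += add_widen_pack((uint8_t)a, (uint8_t)b) !=
                   paddusb_lane((uint8_t)a, (uint8_t)b);
            bad += sub_widen_pack((uint8_t)a, (uint8_t)b) !=
                   psubusb_lane((uint8_t)a, (uint8_t)b);
        }
    return bad;
}
```

Since the byte-wise instructions need no zeroed register and no unpacking,
they can also work on all 16 bytes of an xmm register at once.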

> +cglobal blend_darken, 9, 10, 2, 0, top, top_linesize, bottom, bottom_linesize, dst, dst_linesize, width, start, end
[...]
> +        movh            m0, [topq + x]
> +        movh            m1, [bottomq + x]
> +        pminub          m0, m1
> +        movh    [dstq + x], m0
[...]
> +cglobal blend_lighten, 9, 10, 2, 0, top, top_linesize, bottom, bottom_linesize, dst, dst_linesize, width, start, end
[...]
> +        movh            m0, [topq + x]
> +        movh            m1, [bottomq + x]
> +        pmaxub          m0, m1
> +        movh    [dstq + x], m0

You're only utilizing the lower half of the registers here: with xmm
registers movh moves 8 bytes, so each pminub/pmaxub handles 8 pixels
instead of 16.
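
As a rough sketch of what using the whole register looks like, here is the
darken case in C with SSE2 intrinsics: full 16-byte unaligned loads and
stores around _mm_min_epu8 (pminub). The name blend_darken_row, the
signature, the scalar tail and the non-x86 fallback are assumptions made for
this example, not code from the patch.

```c
#include <stddef.h>
#include <stdint.h>
#if defined(__SSE2__)
#include <emmintrin.h>  /* SSE2: _mm_loadu_si128, _mm_min_epu8, _mm_storeu_si128 */
#endif

/* Hypothetical helper, not from the patch: dst[i] = min(top[i], bottom[i]). */
static void blend_darken_row(const uint8_t *top, const uint8_t *bottom,
                             uint8_t *dst, size_t width)
{
    size_t x = 0;
#if defined(__SSE2__)
    /* Full-width loads and stores: one pminub covers 16 pixels,
     * where an 8-byte load/store pair would cover only 8. */
    for (; x + 16 <= width; x += 16) {
        __m128i a = _mm_loadu_si128((const __m128i *)(top + x));
        __m128i b = _mm_loadu_si128((const __m128i *)(bottom + x));
        _mm_storeu_si128((__m128i *)(dst + x), _mm_min_epu8(a, b));
    }
#endif
    /* Scalar tail (and the whole row on builds without SSE2). */
    for (; x < width; x++)
        dst[x] = top[x] < bottom[x] ? top[x] : bottom[x];
}

/* Returns 1 if blend_darken_row matches a scalar minimum on a test row. */
static int blend_darken_row_selftest(void)
{
    uint8_t top[45], bottom[45], dst[45];
    for (size_t i = 0; i < sizeof(top); i++) {
        top[i]    = (uint8_t)(i * 11 + 1);
        bottom[i] = (uint8_t)(200 - i * 3);
    }
    blend_darken_row(top, bottom, dst, sizeof(top));
    for (size_t i = 0; i < sizeof(top); i++) {
        uint8_t want = top[i] < bottom[i] ? top[i] : bottom[i];
        if (dst[i] != want)
            return 0;
    }
    return 1;
}
```

Lighten is the same loop with _mm_max_epu8 (pmaxub) in place of
_mm_min_epu8.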

