SwScaler performance help (was Re: [MPlayer-dev-eng] [PATCH] vf_osd updates - fully baked?)

Jason Tackaberry tack at sault.org
Tue Sep 13 19:00:45 CEST 2005


On Tue, 2005-09-13 at 17:55 +0200, Reimar Döffinger wrote:
> Well, I didn't see your code yet *g*.

True, I'll post an update later today or tomorrow.

> But, the fastest way from what I saw would be if the input already was
> BGR (not BGRA) and a separate alpha plane.

It's not really feasible to ask the client to send data that way.
That's a pretty peculiar arrangement and no image or canvas library
(which would be used on the client side) would support that. 

But you do make a good point that BGR24->YV12 is going to be faster
(because there is an unscaled special converter for it).  Since we're
looping over the BGRA buffer anyway to pull the alpha channel out, I
tested creating a BGR24 bitmap and using the BGR24->YV12 converter.  It's
now 30% faster than before (or just a touch over 2 times slower than the
existing custom colorspace conversion code).

More than 50% of the time is now spent in the loop that decomposes the
BGRA buffer into BGR24 and an alpha plane (ry is the slice top, rh is the
slice height, and w is the OSD width):

    unsigned char *p_alpha = priv->alpha_tmp + (ry*w);
    unsigned char *p_bgr24 = priv->bgr24_tmp + (ry*w*3);
    for (i=(ry*w*4); i < (ry*w*4) + (w*rh*4); i += 4, p_bgr24 += 3) {
        *(uint32_t *)p_bgr24 = *(uint32_t *)&priv->bgra_imgbuf[i];
        *(p_alpha++) = priv->bgra_imgbuf[i+3];
    }

Maybe it could benefit from SSE, although I can't really see how.  (But
I also don't know SSE instructions very well.)
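
For comparison, here is a minimal byte-wise sketch of the same
decomposition.  The function name split_bgra and its parameters are
hypothetical and not taken from vf_osd.  Unlike a 32-bit store into the
BGR24 buffer, it writes exactly three bytes per pixel, so it never
touches the byte after the end of the destination on the last pixel.
Whether it is faster or slower than the loop above has not been measured.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of vf_osd): split packed BGRA pixels
 * into a packed BGR24 buffer and a separate 8-bit alpha plane.
 * bgra must hold npixels*4 bytes, bgr24 npixels*3, alpha npixels. */
void split_bgra(const uint8_t *bgra, uint8_t *bgr24,
                uint8_t *alpha, size_t npixels)
{
    for (size_t i = 0; i < npixels; i++) {
        bgr24[0] = bgra[0];   /* B */
        bgr24[1] = bgra[1];   /* G */
        bgr24[2] = bgra[2];   /* R */
        *alpha++ = bgra[3];   /* A goes to its own plane */
        bgra  += 4;
        bgr24 += 3;
    }
}
```

For a slice, the call would presumably start at offsets ry*w*4, ry*w*3
and ry*w in the three buffers, with npixels = w*rh.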

> This might also allow for very fast fade-in/fade-out...

There is a global alpha setting that can be used for fast fades.  This
way no colorspace conversion is needed at all.
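
To illustrate why a global alpha avoids reconversion, here is a rough
sketch of blending one already-converted sample.  The name blend_sample
and the exact rounding are assumptions made for illustration, not the
vf_osd code: the fade level only scales the per-pixel alpha at blend
time, so the converted OSD data itself stays unchanged.

```c
#include <stdint.h>

/* Hypothetical per-sample blend (e.g. for one Y value).
 * video:        sample from the video frame
 * osd:          already-converted OSD sample
 * alpha:        per-pixel alpha from the alpha plane, 0..255
 * global_alpha: fade level applied to the whole OSD, 0..255 */
uint8_t blend_sample(uint8_t video, uint8_t osd,
                     uint8_t alpha, uint8_t global_alpha)
{
    /* Effective alpha: per-pixel alpha scaled by the global level. */
    unsigned a = ((unsigned)alpha * global_alpha + 127) / 255;
    /* Plain "over" blend with rounding. */
    return (uint8_t)((osd * a + video * (255 - a) + 127) / 255);
}
```

Fading then only means changing global_alpha from frame to frame.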

> Providing an accelerated BGRA->YV12 conversion in swscaler might help as well.

I think this is the best possible solution, but doing that properly
would be a big undertaking for me.

I think the best plan is to use swscaler in vf_osd, making what's there
as tight as I can, and in the future I or someone else can optimize by
adding a proper BGRA->YV12 converter to swscaler.

Cheers,
Jason.