SwScaler performance help (was Re: [MPlayer-dev-eng] [PATCH] vf_osd updates - fully baked?)
Jason Tackaberry
tack at sault.org
Tue Sep 13 19:00:45 CEST 2005
On Tue, 2005-09-13 at 17:55 +0200, Reimar Döffinger wrote:
> Well, I didn't see your code yet *g*.
True, I'll post an update later today or tomorrow.
> But, the fastest way from what I saw would be if the input already was
> BGR (not BGRA) and a separate alpha plane.
It's not really feasible to ask the client to send data that way.
That's a pretty peculiar arrangement and no image or canvas library
(which would be used on the client side) would support that.
But you do make a good point that BGR24->YV12 is going to be faster
(because there is an unscaled special converter for it). Since we're
looping over the BGRA buffer anyway to pull the alpha channel out, I
tested creating a BGR24 bitmap and using the BGR24->YV12 converter. It's
now 30% faster than before (or just a touch over 2 times slower than the
existing custom colorspace conversion code).
More than 50% of the time is now spent in the loop that decomposes the
BGRA buffer into BGR24 and an alpha plane (ry is the slice top, rh is the
slice height, and w is the OSD width):
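unsigned char *p_alpha = priv->alpha_tmp + (ry*w);
unsigned char *p_bgr24 = priv->bgr24_tmp + (ry*w*3);
/* Walk the slice's BGRA pixels: 4 bytes in, 3 BGR bytes + 1 alpha byte out. */
for (i=(ry*w*4); i < (ry*w*4) + (w*rh*4); i += 4, p_bgr24 += 3) {
    /* 4-byte store; the stray alpha byte is overwritten by the next pixel.
       The last pixel of the slice spills 1 byte past its BGR24 data, and
       the store is unaligned for most pixels. */
    *(uint32_t *)p_bgr24 = *(uint32_t *)&priv->bgra_imgbuf[i];
    *(p_alpha++) = priv->bgra_imgbuf[i+3];
}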
Maybe it could benefit from SSE, although I can't really see how. (But
I also don't know SSE instructions very well.)
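For reference, a minimal portable sketch of the same BGRA -> BGR24 + alpha
split, assuming plain byte buffers rather than the priv struct. The names
split_bgra and split_bgra_slice are hypothetical and not taken from vf_osd.
It copies exactly 3 bytes per pixel, so it never writes past the BGR24 data
and makes no alignment assumptions; it is not claimed to be faster than the
loop above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Split npixels of packed BGRA into packed BGR24 plus a separate
 * alpha plane. Hypothetical helper; names are illustrative only. */
static void split_bgra(const uint8_t *bgra, uint8_t *bgr24,
                       uint8_t *alpha, size_t npixels)
{
    assert(bgra && bgr24 && alpha);
    for (size_t i = 0; i < npixels; i++) {
        memcpy(bgr24, bgra, 3);  /* B, G, R: exactly 3 bytes per pixel */
        *alpha++ = bgra[3];      /* A goes to the separate plane */
        bgra  += 4;
        bgr24 += 3;
    }
}

/* Slice variant: rows ry .. ry+rh-1 of an image that is w pixels wide,
 * mirroring the ry/rh/w offsets used in the loop above. */
static void split_bgra_slice(const uint8_t *bgra_img, uint8_t *bgr24_img,
                             uint8_t *alpha_img,
                             size_t w, size_t ry, size_t rh)
{
    split_bgra(bgra_img + ry * w * 4,
               bgr24_img + ry * w * 3,
               alpha_img + ry * w,
               w * rh);
}
```

Compilers usually turn the fixed-size memcpy into a couple of plain moves,
but whether this is any quicker than the 32-bit store would need measuring.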
> This might also allow for very fast fade-in/fade-out...
There is a global alpha setting that can be used for fast fades. This
way no colorspace conversion is needed at all.
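To illustrate why a global alpha makes fades cheap, here is a rough sketch
of blending one plane where the per-pixel alpha is scaled by a global value
at blend time. This is an assumption about how such a setting could be
applied, not the actual vf_osd code; blend_plane and its parameters are
hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Blend n OSD samples onto video samples in place.
 * global_alpha (0..255) scales every per-pixel alpha, so a fade only
 * changes this one value; the already-converted OSD planes stay as-is.
 * Hypothetical helper for illustration only. */
static void blend_plane(uint8_t *dst, const uint8_t *osd,
                        const uint8_t *alpha, size_t n,
                        uint8_t global_alpha)
{
    assert(dst && osd && alpha);
    for (size_t i = 0; i < n; i++) {
        /* Effective alpha = per-pixel alpha * global alpha, rounded. */
        unsigned a = (alpha[i] * global_alpha + 127) / 255;
        /* Standard "over" blend with rounding. */
        dst[i] = (uint8_t)((osd[i] * a + dst[i] * (255 - a) + 127) / 255);
    }
}
```

Fading in or out then means stepping global_alpha from frame to frame and
re-running the blend, with no need to touch the BGRA source again.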
> Providing an accelerated BGRA->YV12 conversion in swscaler might help as well.
I think this is the best possible solution, but doing that properly
would be a big undertaking for me.
I think the best plan is to use swscaler in vf_osd, making what's there
as tight as I can, and in the future I or someone else can optimize by
adding a proper BGRA->YV12 converter to swscaler.
Cheers,
Jason.