[mplayer] software scaling with DGA, MMX coder wanted!
Arpi
arpi at thot.banki.hu
Mon Mar 5 10:32:32 CET 2001
Hi,
I've just finished the software scaling version of Andreas Ackermann's
DGA driver. For now it works only with YV12 codecs (MPEG-1/2, OpenDivX).
It scales the image in YUV space and converts to RGB as the last step, so
the quality is the highest possible. (I'm using bilinear scaling!)
The only problem is the speed. It eats 108% of my P3-900 playing
a 352x288 mpeg1 movie scaled to 800x600. :-(
Someone with MMX/assembly knowledge should rewrite some inner loops
in the code. I've marked them with comments in the source, you'll find...
Right now everything is in C. The most speed-critical parts:
// this loop should be rewritten in MMX assembly!!!!
for(i=0;i<vo_dga_vp_width;i++){
    register unsigned int xx=xpos>>8;
    register unsigned int xalpha=xpos&0xFF;
    buf1[i]=(src[xx]*(xalpha^255)+src[xx+1]*xalpha);
    xpos+=dga_xinc;
}
// this loop should be rewritten in MMX assembly!!!!
for(i=0;i<vo_dga_vp_width;i++){
    register unsigned int xx=xpos>>8;
    register unsigned int xalpha=xpos&0xFF;
    uvbuf1[i]=(src1[xx]*(xalpha^255)+src1[xx+1]*xalpha);
    uvbuf1[i+2048]=(src2[xx]*(xalpha^255)+src2[xx+1]*xalpha);
    xpos+=dga_xinc2;
}
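The two horizontal loops above step through the source row in 8.8 fixed
point: the upper bits of xpos select the source pixel and the low 8 bits
are the blend weight. Below is a minimal standalone sketch of that pass,
meant as a plain C reference for checking an optimized version. The names
hscale_step and hscale_row, the uint16_t buffer type, the way the step is
computed, and the clamp at the right edge are assumptions made for
illustration; none of them is taken from the driver source.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: 8.8 fixed-point step so that dst_w output pixels
   cover src_w input pixels (assumed to play the role of dga_xinc). */
unsigned int hscale_step(unsigned int src_w, unsigned int dst_w)
{
    return (src_w << 8) / dst_w;
}

/* Horizontal bilinear pass over one row of 8-bit samples.
   Each output is left scaled by 255 (the weights xalpha^255 and xalpha
   sum to 255), so the largest value is 255*255 = 65025 and fits in
   16 bits. Assumption: the buffer element type is uint16_t. */
void hscale_row(uint16_t *dst, const uint8_t *src,
                unsigned int src_w, unsigned int dst_w, unsigned int xinc)
{
    unsigned int xpos = 0;
    unsigned int i;

    for (i = 0; i < dst_w; i++) {
        unsigned int xx = xpos >> 8;        /* integer source index   */
        unsigned int xalpha = xpos & 0xFF;  /* fractional part, 0..255 */
        /* Assumption: clamp the right neighbour so the last pixel does
           not read past the end of the row. */
        unsigned int next = (xx + 1 < src_w) ? xx + 1 : xx;

        dst[i] = (uint16_t)(src[xx] * (xalpha ^ 255) + src[next] * xalpha);
        xpos += xinc;
    }
}
```

For a 352-pixel row scaled to 800 pixels, hscale_step(352, 800) gives 112,
i.e. about 0.44 source pixels per output pixel.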
// this loop should be rewritten in MMX assembly!!!!
for(i=0;i<vo_dga_vp_width;i++){
    // linear interpolation && yuv2rgb in a single step:
    int Y=yuvtab_2568[((buf0[i]*yalpha1+buf1[i]*yalpha)>>16)];
    int U=((uvbuf0[i]*uvalpha1+uvbuf1[i]*uvalpha)>>16);
    int V=((uvbuf0[i+2048]*uvalpha1+uvbuf1[i+2048]*uvalpha)>>16);
    dest[0]=clip_table[((Y + yuvtab_3343[U]) >>13)];
    dest[1]=clip_table[((Y + yuvtab_0c92[V] + yuvtab_1a1e[U]) >>13)];
    dest[2]=clip_table[((Y + yuvtab_40cf[V]) >>13)];
    dest+=vo_dga_bpp;
}
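The vertical half of this last loop can be checked in isolation too. The
sketch below blends two horizontally scaled rows and shifts the result
back to 8 bits. It assumes that yalpha1 is yalpha^255, mirroring the
xalpha weights (how yalpha1 is computed is not shown above), and it
leaves out the yuvtab_* and clip_table lookups because their contents are
not shown either. vblend_row is a hypothetical name.

```c
#include <stddef.h>
#include <stdint.h>

/* Blend two rows produced by a horizontal pass (each value is an 8-bit
   sample scaled by 255) into 8-bit samples.
   Assumption: the weight of row0 is yalpha ^ 255. */
void vblend_row(uint8_t *dst, const uint16_t *row0, const uint16_t *row1,
                size_t width, unsigned int yalpha)
{
    unsigned int yalpha1 = yalpha ^ 255;
    size_t i;

    for (i = 0; i < width; i++) {
        /* Largest possible sum is 65025 * 255, well inside 32 bits. */
        unsigned int sum = (unsigned int)row0[i] * yalpha1
                         + (unsigned int)row1[i] * yalpha;
        dst[i] = (uint8_t)(sum >> 16);
    }
}
```

Because both passes use weights that sum to 255 and the final shift
divides by 65536, a flat input of 100 comes out as 99 in this sketch and
255 comes out as 253. Whether the lookup tables compensate for that is not
visible from the snippet.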
I think these are an easy job for an experienced MMX/asm coder, and
rewriting them could help the speed a lot!
Contact me if you are interested in rewriting/optimizing these!
A'rpi / Astral & ESP-team
--
mailto:arpi at thot.banki.hu
http://esp-team.scene.hu