[Ffmpeg-devel] [RFC] RGB48 support

Tue Apr 10 23:13:30 CEST 2007

Hi

On Tue, Apr 10, 2007 at 06:15:35PM +0200, Ivo wrote:
> Hi,
> 
> On Monday 09 April 2007 23:46, Michael Niedermayer wrote:
> > On Mon, Apr 09, 2007 at 08:16:30PM +0200, Ivo wrote:
> > > Any comments?
> >
> > the only thing in swscale which you must implement is the new format->yuv
> > code, swscale uses yuv internally (see svn log of swscale for some
> > examples on how other formats where added)
> >
> > the rgb->rgb code is optional but surely a good idea ...
> >
> > also the rgb48->yuv can be done in 2 ways
> > A. -> 8bit based yuv
> > B. -> internal yuv (which uses >8bits per component) but will need
> >    slightly more work, also theres no example in svn for this yet ...
> >
> > PS: this has nothing to do with yuv2rgb*.c ! the code you need to change
> > is in swscale*
> 
> Here's my progress. I implemented conversion both to and from 8-bit based 
> yuv for now (your option A.) so in the end I did touch yuv2rgb*.c too ;-)
> 
> I noticed several lines of:
> 
> src2= formatConvBuffer+2048;
> 
> in swscale_template.c. If I understand the code correctly, this constant is 
> arbitrarily chosen and swscale will fail if you feed it very large images 
> (like an 8k 70mm film scan). Is that correct?

yes, the 2048 stuff should be changed to a #defined constant (or variable but
that will be more difficult and likely somewhat slower, so constant is better)

> 
> I will continue to work on the missing rgb2rgb functions, including some MMX 
> acceleration for some of the important ones. 

:)

> Any comments on the code so 

did you test the yuv2rgb.c code ?

[...]
> Index: libavutil/avutil.h
> ===================================================================
> --- libavutil/avutil.h	(revision 8705)
> +++ libavutil/avutil.h	(working copy)
> @@ -107,6 +107,8 @@
>  
>      PIX_FMT_GRAY16BE,  ///<        Y        , 16bpp, big-endian
>      PIX_FMT_GRAY16LE,  ///<        Y        , 16bpp, little-endian
> +    PIX_FMT_RGB48BE,   ///< Packed RGB 16:16:16, 48bpp, big-endian
> +    PIX_FMT_RGB48LE,   ///< Packed RGB 16:16:16, 48bpp, little-endian

iam fine with this but i think the comments could be clearer

[...]
> +static inline void RENAME(rgb48to48)(const uint8_t *src, uint8_t *dst, long src_size)
> +{
> +    uint8_t *d = dst, *s = (uint8_t *) src;
> +    const uint8_t *end = s + src_size;
> +
> +    for (; s<end; s+=2, d+=2) {
> +        *d     = *(s+1);
> +        *(d+1) = *s;
> +    }
> +}

hmm try:
unsigned int a= *(uint32_t*)s;
*(uint32_t*)d= ((a>>8)&0x00FF00FF) + ((a<<8)&0xFF00FF00);

> +
> +static inline void RENAME(rgb48beto24)(const uint8_t *src, uint8_t *dst, long src_size)
> +{
> +    uint8_t *d = dst, *s = (uint8_t *) src;
> +    const uint8_t *end = s + src_size;
> +
> +    for (; s<end; s+=2, d++)
> +        *d = *s;
> +}
> +
> +static inline void RENAME(rgb48leto24)(const uint8_t *src, uint8_t *dst, long src_size)
> +{
> +    uint8_t *d = dst, *s = (uint8_t *) src;
> +    const uint8_t *end = s + src_size;
> +
> +    for (; s<end; s+=2, d++)
> +        *d = *(s+1);
> +}

hmmmmm, isnt this the same as the gray conversation stuff? if yes please
avoid the code duplication / use it for both gray and rgb

> +
> +static inline void RENAME(rgb24to48)(const uint8_t *src, uint8_t *dst, long src_size)
> +{
> +    uint8_t *d = dst, *s = (uint8_t *) src;
> +    const uint8_t *end = s + src_size;
> +    {START_TIMER
> +#ifdef HAVE_MMX
> +    __asm __volatile(PREFETCH" %0"    : :"m"(*s) :"memory");
> +    while(s < end - 23)
> +    {
> +    __asm __volatile(
> +		PREFETCH"	32%1		\n"
> +	"	movd	 	  %1,   %%mm0	\n"
> +	"	movd	 	 4%1,   %%mm1	\n"
> +	"	movd 		 8%1,   %%mm2	\n"
> +	"	movd 		12%1,   %%mm3	\n"
> +	"	movd 		16%1,   %%mm4	\n"
> +	"	movd 		20%1,   %%mm5	\n"

> +	"	movq		%%mm0,  %%mm6	\n"
> +	"	movq		%%mm1,  %%mm7	\n"
> +	"	punpcklbw	%%mm0,  %%mm6	\n"
> +	"	punpcklbw	%%mm1,  %%mm7	\n"

hmm try:
punpcklbw	%%mm0,  %%mm0

[...]
> +    d += 48;
> +    s += 24;
> +    }

please write the whole loop in asm, i hate c-loop + asm code as gcc tends
to make a quite suboptimal mess out of it most of the time ...

> +    __asm __volatile(SFENCE:::"memory");
> +    __asm __volatile(EMMS:::"memory");
> +#endif
> +    for (; s<end; s++, d+=2) {
> +        register uint8_t x = *s;

a simple int should do

> +        *((uint16_t *)d) = (x << 8) + x;
> +    }
> +    STOP_TIMER("rgb24to48")}

iam happy that you benchmark your code but this shouldnt be in the patches
you send :)

[...]
> +static inline void RENAME(rgb48leToY)(uint8_t *dst, uint8_t *src, int width)
> +{
> +	int i;
> +	for(i=0; i<width; i++)
> +	{
> +		int r= src[i*6+1];
> +		int g= src[i*6+3];
> +		int b= src[i*6+5];
> +
> +		dst[i]= ((RY*r + GY*g + BY*b + (33<<(RGB2YUV_SHIFT-1)) )>>RGB2YUV_SHIFT);
> +	}
> +}

code duplication rgb48beToY(+1) will work too (of course alternative 
high accuracy code which scales the 16bits rgb to internal yuv is welcome 
too ...)

[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

No great genius has ever existed without some touch of madness. -- Aristotle
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: not available
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20070410/dc514d6c/attachment.pgp>