[FFmpeg-devel] [PATCH] dnxhd get_pixels_4x8_sym sse

Thu Dec 11 03:22:15 CET 2008

Michael Niedermayer wrote:
> On Wed, Dec 10, 2008 at 05:39:05PM -0800, Baptiste Coudurier wrote:
>> Hi,
>>
>> $subject. I don't pollute dsputil any further, since this is used only by the
>> dnxhd encoder AFAIK.
> [...]
> 
>> @@ -158,8 +159,13 @@
>>  
>>      dsputil_init(&ctx->m.dsp, avctx);
>>      ff_dct_common_init(&ctx->m);
>> +#ifdef HAVE_MMX
>> +    ff_dnxhd_init_mmx(ctx);
>> +#endif
>>      if (!ctx->m.dct_quantize)
>>          ctx->m.dct_quantize = dct_quantize_c;
>> +    if (!ctx->get_pixels_4x8_sym)
>> +        ctx->get_pixels_4x8_sym = dnxhd_get_pixels_4x8;
>>  
>>      ctx->m.mb_height = (avctx->height + 15) / 16;
>>      ctx->m.mb_width  = (avctx->width  + 15) / 16;
> 
> Am I missing something, or could the if() be avoided by setting the pointer
> before dsputil_init/ff_dnxhd_init_mmx?

Hmm, right, interesting. The if() check after init is just how it's usually
done everywhere else. Changed.
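
For illustration, here is a minimal sketch of that reordering: set the C
default first, then let the arch-specific init override it, so no NULL check
is needed afterwards. Everything named Demo*/demo_* is hypothetical and only
stands in for DNXHDEncContext, dnxhd_get_pixels_4x8 and ff_dnxhd_init_mmx;
the 16-bit DCTELEM typedef is an assumption as well.

```c
#include <stdint.h>

typedef int16_t DCTELEM; /* assumption: 16-bit coefficients */

/* Hypothetical stand-in for the encoder context. */
typedef struct DemoEncContext {
    void (*get_pixels_4x8_sym)(DCTELEM *block, const uint8_t *pixels, int line_size);
} DemoEncContext;

/* Stand-in for the portable C version; writes a marker, not real pixels. */
static void demo_get_pixels_c(DCTELEM *block, const uint8_t *pixels, int line_size)
{
    (void)pixels; (void)line_size;
    block[0] = 1;
}

/* Stand-in for an optimized version; writes a different marker. */
static void demo_get_pixels_simd(DCTELEM *block, const uint8_t *pixels, int line_size)
{
    (void)pixels; (void)line_size;
    block[0] = 2;
}

/* Stand-in for the arch-specific init: overrides only when supported. */
static void demo_init_simd(DemoEncContext *ctx, int have_simd)
{
    if (have_simd)
        ctx->get_pixels_4x8_sym = demo_get_pixels_simd;
}

static void demo_init(DemoEncContext *ctx, int have_simd)
{
    ctx->get_pixels_4x8_sym = demo_get_pixels_c; /* default first */
    demo_init_simd(ctx, have_simd);              /* may override it */
    /* no "if (!ctx->get_pixels_4x8_sym)" fallback is needed here */
}
```

The pointer is always valid after demo_init(), whether or not the optimized
path is available.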

> Also, isn't it 8x4 instead of 4x8? (That would be a separate fix unrelated
> to this patch, of course.) I just noticed it...

Well, it's 4 rows of 8 pixels; I guess it depends on how you see it.
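
To make the layout concrete, here is a plain C sketch of what the quoted SSE2
stores appear to do: read 4 rows of 8 pixels, widen them to 16 bits, and write
them into an 8x8 block with the bottom half mirroring the top (rows 3, 2, 1,
0). The name get_pixels_4x8_sym_ref and the 16-bit DCTELEM typedef are
assumptions for illustration; this is not the C function from the patch.

```c
#include <stdint.h>

typedef int16_t DCTELEM; /* assumption: 16-bit coefficients */

/* Hypothetical reference version derived from the store offsets in the asm. */
static void get_pixels_4x8_sym_ref(DCTELEM *block, const uint8_t *pixels, int line_size)
{
    for (int row = 0; row < 4; row++) {
        for (int col = 0; col < 8; col++) {
            DCTELEM v = pixels[row * line_size + col];
            block[row * 8 + col]       = v; /* rows 0..3: straight copy */
            block[(7 - row) * 8 + col] = v; /* rows 7..4: mirrored copy */
        }
    }
}
```

Row 0 of the source lands at byte offsets 0 and 112 of the block, row 3 at 48
and 64, which matches the movdqa offsets in the quoted code.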

> [...]
>> Index: libavcodec/dnxhdenc.h
>> ===================================================================
>> --- libavcodec/dnxhdenc.h	(revision 16051)
>> +++ libavcodec/dnxhdenc.h	(working copy)
> 
>> @@ -81,6 +81,8 @@
>>  
>>      RCCMPEntry *mb_cmp;
>>      RCEntry   (*mb_rc)[8160];
>> +
>> +    void (*get_pixels_4x8_sym)(DCTELEM *, const uint8_t *, int);
>>  } DNXHDEncContext;
> 
> The required alignment of the arguments should be documented, as in dsputil.

Ok.
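
As a rough idea of what such documentation could look like (the wording in the
updated patch is not shown in this message), here is a sketch. The typedef
name, the comment text and the is_aligned() helper are hypothetical; the
16-byte requirement on block is inferred from the movdqa stores in the quoted
asm, while the movq loads impose no alignment on pixels.

```c
#include <stddef.h>
#include <stdint.h>

typedef int16_t DCTELEM; /* assumption: 16-bit coefficients */

/**
 * Hypothetical documented prototype.
 * block:     destination, 64 coefficients; must be 16-byte aligned,
 *            because the quoted SSE2 version stores with movdqa
 * pixels:    source; no alignment required by the quoted movq loads
 * line_size: distance in bytes between two source rows
 */
typedef void (*get_pixels_4x8_sym_fn)(DCTELEM *block, const uint8_t *pixels, int line_size);

/* Returns 1 if p is a multiple of alignment, 0 otherwise. */
static int is_aligned(const void *p, size_t alignment)
{
    return ((uintptr_t)p % alignment) == 0;
}

/* Shows one way (C11 _Alignas) for a caller to satisfy the contract. */
static int demo_block_is_aligned(void)
{
    _Alignas(16) DCTELEM block[64];
    return is_aligned(block, 16);
}
```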

>> +static void get_pixels_4x8_sym_sse2(DCTELEM *block, const uint8_t *pixels, int line_size)
>> +{
>> +    __asm__ volatile(
>> +        "pxor %%xmm7,      %%xmm7       \n\t"
>> +        "movq (%0),        %%xmm0       \n\t"
>> +        "movq (%0, %2),    %%xmm1       \n\t"
>> +        "movq (%0, %2,2),  %%xmm2       \n\t"
>> +        "movq (%0, %3),    %%xmm3       \n\t"
>> +        "punpcklbw %%xmm7, %%xmm0       \n\t"
>> +        "punpcklbw %%xmm7, %%xmm1       \n\t"
>> +        "punpcklbw %%xmm7, %%xmm2       \n\t"
>> +        "punpcklbw %%xmm7, %%xmm3       \n\t"
>> +        "movdqa %%xmm0,      (%1)       \n\t"
>> +        "movdqa %%xmm1,    16(%1)       \n\t"
>> +        "movdqa %%xmm2,    32(%1)       \n\t"
>> +        "movdqa %%xmm3,    48(%1)       \n\t"
>> +        "movdqa %%xmm3,    64(%1)       \n\t"
>> +        "movdqa %%xmm2,    80(%1)       \n\t"
>> +        "movdqa %%xmm1,    96(%1)       \n\t"
>> +        "movdqa %%xmm0,   112(%1)       \n\t"
>> +        : "+r" (pixels)
>> +        : "r" (block), "r" ((x86_reg)line_size), "r" ((x86_reg)line_size*3)
> 
> The code is not changing pixels (it does change pixels[x], but that's
> something else), thus it does not need "+/=r".
> 

Ok, thanks, changed.
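
For reference, a small sketch of the input-only form on a single row. The
pointer registers are never written by the asm, so both go in the input list
as plain "r". The function name widen_row8 is hypothetical, and the "memory"
and xmm clobbers plus the unaligned movdqu store are my additions for a safe
standalone example, not something taken from the patch. A C fallback is used
when SSE2 inline asm is not available.

```c
#include <stdint.h>

/* Hypothetical helper: widen 8 unsigned bytes to 8 16-bit values. */
static void widen_row8(int16_t *dst, const uint8_t *src)
{
#if defined(__GNUC__) && defined(__SSE2__)
    __asm__ volatile(
        "pxor      %%xmm7, %%xmm7 \n\t" /* xmm7 = 0 */
        "movq      (%0),   %%xmm0 \n\t" /* load 8 bytes from src */
        "punpcklbw %%xmm7, %%xmm0 \n\t" /* interleave with zero bytes */
        "movdqu    %%xmm0, (%1)   \n\t" /* store 8 words to dst */
        : /* no outputs: neither pointer register is modified */
        : "r" (src), "r" (dst)
        : "memory", "xmm0", "xmm7" /* asm writes through dst */
    );
#else
    for (int i = 0; i < 8; i++)
        dst[i] = src[i];
#endif
}
```

"+r" would only be appropriate if the asm itself advanced the pointer, for
example with an add inside a loop.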

Updated patch attached.

-- 
Baptiste COUDURIER                              GnuPG Key Id: 0x5C1ABAAA
Key fingerprint                 8D77134D20CC9220201FC5DB0AC9325C5C1ABAAA
checking for life_signs in -lkenny... no
-------------- next part --------------
A non-text attachment was scrubbed...
Name: get_pixels_4x8_sym_2.patch
Type: text/x-diff
Size: 5374 bytes
Desc: not available
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20081210/e925d2e3/attachment.patch>