[FFmpeg-devel] [PATCH] improve yuv422p to rgb in libswscale

Michael Niedermayer michaelni
Wed Dec 1 04:13:57 CET 2010


On Tue, Nov 30, 2010 at 07:05:05PM -0800, Baptiste Coudurier wrote:
> On 11/30/2010 06:53 PM, Michael Niedermayer wrote:
>> On Tue, Nov 30, 2010 at 06:29:32PM -0800, Baptiste Coudurier wrote:
>>> On 11/30/2010 06:15 PM, Michael Niedermayer wrote:
>>>> On Tue, Nov 30, 2010 at 05:20:42PM -0800, Baptiste Coudurier wrote:
>>>>> On 11/30/2010 05:13 PM, Michael Niedermayer wrote:
>>>>>> On Tue, Nov 30, 2010 at 04:07:20AM -0800, Baptiste Coudurier wrote:
>>>>>>> Hi
>>>>>>>
>>>>>>> $subject, use full vertical data when convert 422p, improve quality a lot.
>>>>>>>
>>>>>>> --
>>>>>>> Baptiste COUDURIER
>>>>>>> Key fingerprint                 8D77134D20CC9220201FC5DB0AC9325C5C1ABAAA
>>>>>>> FFmpeg maintainer                                  http://www.ffmpeg.org
>>>>>>
>>>>>>>     x86/yuv2rgb_template.c |   25 ++++-------
>>>>>>>     yuv2rgb.c              |  109 ++++++-------------------------------------------
>>>>>>>     2 files changed, 26 insertions(+), 108 deletions(-)
>>>>>>> 16f384c9b114c76572a539511c42267ce2942c67  yuv422p_to_rgb.patch
>>>>>>
>>>>>> this looks like it would make 420p->rgb quite a bit slower.
>>>>>
>>>>> Do you think changing>>1 to>>vshift would make is quite a bit slower ?
>>>>
>>>> no but that stuff:
>>>> @@ -152,134 +144,102 @@
>>>>    YUV2RGBFUNC(yuv2rgb_c_48, uint8_t, 0)
>>>>        LOADCHROMA(0);
>>>>        PUTRGB48(dst_1,py_1,0);
>>>> -    PUTRGB48(dst_2,py_2,0);
>>>>
>>>>        LOADCHROMA(1);
>>>> -    PUTRGB48(dst_2,py_2,1);
>>>>        PUTRGB48(dst_1,py_1,1);
>>>>
>>>>        LOADCHROMA(2);
>>>>        PUTRGB48(dst_1,py_1,2);
>>>> -    PUTRGB48(dst_2,py_2,2);
>>>>
>>>>        LOADCHROMA(3);
>>>> -    PUTRGB48(dst_2,py_2,3);
>>>>        PUTRGB48(dst_1,py_1,3);
>>>>    ENDYUV2RGBLINE(48)
>>>>        LOADCHROMA(0);
>>>>        PUTRGB48(dst_1,py_1,0);
>>>> -    PUTRGB48(dst_2,py_2,0);
>>>>
>>>>        LOADCHROMA(1);
>>>> -    PUTRGB48(dst_2,py_2,1);
>>>>        PUTRGB48(dst_1,py_1,1);
>>>>    ENDYUV2RGBFUNC()
>>>>
>>>> and then running the code twice
>>>
>>> This is the C code, the mmx routine for yuv420 to rgb24 is already present.
>>>
>>> I don't understand running the code twice, can you please clarify ?
>>
>> yes, iam seeing this:
>> -    for (y=0; y<srcSliceH; y+=2) {\
>> +    for (y=0; y<srcSliceH; y++) {\
>>
>> and thus iam thinking it runs the code twice after the patch,
>> do i miss something?
>
> The old code was processing 2 lines at once, see py_1 and py_2, that's  
> why a lot of code is removed in the macros.

yes but i think its slower after the patch


>
> Do you agree that the mmx change looks trivial ?

yes mmx looks good

[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Why not whip the teacher when the pupil misbehaves? -- Diogenes of Sinope
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 198 bytes
Desc: Digital signature
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20101201/d1373f57/attachment.pgp>



More information about the ffmpeg-devel mailing list