[FFmpeg-devel] [PATCH] Heavy optimization of IFF decoder
Sebastian Vater
cdgs.basty
Tue Apr 27 20:47:41 CEST 2010
Ronald S. Bultje wrote:
> Hi,
>
> 2010/4/27 Måns Rullgård <mans at mansr.com>:
>
>> "Ronald S. Bultje" <rsbultje at gmail.com> writes:
>>
>>> On Mon, Apr 26, 2010 at 7:39 PM, Sebastian Vater
>>> <cdgs.basty at googlemail.com> wrote:
>>>
>>>> + const uint32_t lut[] = {0x0000000,
>>>> + 0x1000000 << plane,
>>>> + 0x0010000 << plane,
>>>> + 0x1010000 << plane,
>>>> + 0x0000100 << plane,
>>>> + 0x1000100 << plane,
>>>> + 0x0010100 << plane,
>>>> + 0x1010100 << plane,
>>>> + 0x0000001 << plane,
>>>> + 0x1000001 << plane,
>>>> + 0x0010001 << plane,
>>>> + 0x1010001 << plane,
>>>> + 0x0000101 << plane,
>>>> + 0x1000101 << plane,
>>>> + 0x0010101 << plane,
>>>> + 0x1010101 << plane};
>>>>
> >>> I really can't imagine that a static const lut[][] isn't faster. Which
>>> file did you use to test this? (Is it on mphq/samples?)
>>>
>> A static table whose values are shifted in the loop is 7% faster on ARM.
>>
>
> It's a little slower on x86 (~12%). However, a (static) 2D array is
> faster (3%) over the original patch. Mans just said that's fine on ARM
> as well, so you should probably implement that (don't forget that
> plane is const, so do a const *lut = table[plane] before entering the
> loop, else gcc messes up).
>
Please clarify this: what kind of static 2D array should I use now?
Should I just apply the attached one? For me it is 20% slower than my
original (it goes from 6.2k to 8.9k).
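
For reference, here is a minimal sketch of the two variants discussed in
the quoted text: a static 2D table indexed by plane with the row pointer
taken before the loop, and a single static table whose values are shifted
inside the loop. The table contents follow the lut[] from the quoted
patch. The names LUT8_ROW, plane8_lut, nibble_lut, decodeplane8_2d and
decodeplane8_shift, the loop shape and the dst layout are assumptions made
for illustration only; this is not the code from the patch or from the
attachment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One table row: the 16 nibble patterns from the quoted lut[], shifted by
 * a compile-time plane number. The u suffix keeps the shift unsigned, so
 * plane 7 does not overflow a signed int. */
#define LUT8_ROW(p) {                                         \
    0x0000000u << (p), 0x1000000u << (p), 0x0010000u << (p),  \
    0x1010000u << (p), 0x0000100u << (p), 0x1000100u << (p),  \
    0x0010100u << (p), 0x1010100u << (p), 0x0000001u << (p),  \
    0x1000001u << (p), 0x0010001u << (p), 0x1010001u << (p),  \
    0x0000101u << (p), 0x1000101u << (p), 0x0010101u << (p),  \
    0x1010101u << (p) }

/* Variant 1: static 2D table, one pre-shifted row per bitplane. */
static const uint32_t plane8_lut[8][16] = {
    LUT8_ROW(0), LUT8_ROW(1), LUT8_ROW(2), LUT8_ROW(3),
    LUT8_ROW(4), LUT8_ROW(5), LUT8_ROW(6), LUT8_ROW(7)
};

/* Variant 2: single static table, shifted by plane inside the loop. */
static const uint32_t nibble_lut[16] = LUT8_ROW(0);

/* Hypothetical helper: ORs one bitplane row into dst, which must hold
 * 2 * buf_size words (8 chunky pixels per input byte). */
static void decodeplane8_2d(uint32_t *dst, const uint8_t *buf,
                            size_t buf_size, unsigned plane)
{
    assert(plane < 8);
    /* plane is constant for the whole row: pick the table row once. */
    const uint32_t *lut = plane8_lut[plane];
    for (size_t i = 0; i < buf_size; i++) {
        dst[2 * i]     |= lut[buf[i] >> 4];
        dst[2 * i + 1] |= lut[buf[i] & 0x0F];
    }
}

/* Same output as decodeplane8_2d, but shifting in the loop. */
static void decodeplane8_shift(uint32_t *dst, const uint8_t *buf,
                               size_t buf_size, unsigned plane)
{
    assert(plane < 8);
    for (size_t i = 0; i < buf_size; i++) {
        dst[2 * i]     |= nibble_lut[buf[i] >> 4]   << plane;
        dst[2 * i + 1] |= nibble_lut[buf[i] & 0x0F] << plane;
    }
}
```

Both functions should produce identical output; which one is faster
depends on the target, as the numbers quoted above for ARM and x86 show.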
--
Best regards,
:-) Basty/CDGS (-:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: iff-opt-dp8-static.patch
Type: text/x-patch
Size: 2663 bytes
Desc: not available
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20100427/28ec91eb/attachment.bin>