[FFmpeg-devel] [PATCH] AAC: unroll parts of decode_spectrum_and_dequant()
Jason Garrett-Glaser
darkshikari
Tue Dec 9 05:04:10 CET 2008
On Mon, Dec 8, 2008 at 7:58 PM, Jason Garrett-Glaser
<darkshikari at gmail.com> wrote:
> On Mon, Dec 8, 2008 at 7:34 PM, Alex Converse <alex.converse at gmail.com> wrote:
>> On Mon, Dec 8, 2008 at 9:33 PM, Jason Garrett-Glaser
>> <darkshikari at gmail.com>wrote:
>>
>>> On Mon, Dec 8, 2008 at 3:43 PM, Alex Converse <alex.converse at gmail.com>
>>> wrote:
>>> > Hi,
>>> >
>>> > The attached patch unrolling sections of decode spectrum saves me 5.48%
>>> on
>>> > my mpeg4-lc-256kbps stream on my core2 duo.
>>> >
>>> > Regards,
>>> > Alex Converse
>>>
>>> If dim can only be 2 or 4, wouldn't it be better to do
>>>
>>> if( dim == 4 ) {
>>> do dim 4 stuff
>>> }
>>> do dim 2 stuff
>>>
>>> The switch seems unnecessary.
>>>
>>
>> Idiomatically I like the switch better but your way is faster. When I did
>> that I also tried reverting access back to forward order and got a slight
>> speed up. This way made the unsigned loop just like the other three, so I
>> added that one for another benchmarked verified speed up.
>>
>> The net gain is a 12% decrease in cycles over the original vs 5% before.
>
> if (vq_ptr[2]) coef[coef_tmp_idx + 2] = 1 - 2*(int)get_bits1(gb);
> if (vq_ptr[3]) coef[coef_tmp_idx + 3] = 1 - 2*(int)get_bits1(gb);
>
> Isn't that a rather unnecessary int -> float conversion? I'd think
> you could do much better than that considering there are only two
> possible input values...
>
> Dark Shikari
>
Simple proposal for the above:
static const float lookup[2] = {1.0, -1.0};
if (vq_ptr[2]) coef[coef_tmp_idx + 2] = lookup[get_bits1(gb)];
Dark Shikari
More information about the ffmpeg-devel
mailing list