[FFmpeg-devel] [PATCH] AAC decoder
Ivan Kalvachev
ikalvachev
Wed May 28 14:36:33 CEST 2008
On 5/28/08, Robert Swain <robert.swain at gmail.com> wrote:
> 2008/5/26 Michael Niedermayer <michaelni at gmx.at>:
>> On Mon, May 26, 2008 at 12:42:37PM +0100, Robert Swain wrote:
>>> The tables are still different as intensity uses pow(0.5, (i-100)/4.)
>>> and the other cases use pow(2.0, (i-100)/4.).
>>
>> pow(0.5, (i-100)/4.) == pow(2.0, (100-i)/4.)
>>
>> and
>>
>> pow(2.0, (100-i)/4.) / 1024 == pow(2.0, (100-i)/4.-10) == pow(2.0, (100-i-40)/4.)
>>
>> possibly these allow the 2 tables to be merged, i mean
>>
>> pow2sf_tab[i] and intensity_tab[i]
>> to
>> pow2sf_tab[i+C] and pow2sf_tab[-i]
>
> OK, I've thought about this a bit more. I think either sf_scale should
> be 'applied' just before downmixing/float_to_int16 conversion as in
> ac3dec.c or sf_scale can effectively be merged into this table almost
> entirely thanks to being representable as a power of 2 in either the C
> or SIMD float_to_int16 case.
>
> From what I see there are 3 cases.
>
> - intensity table:
> pow(0.5, (i-100)/4) = pow(2, (100-i)/4)
> which would have indices [100-255, 100-0] = [-155, 100]
>
> - sf table when sf_scale is -1/1024:
> pow(2, (i-100)/4) * -pow(2, -10) = -pow(2, (i-140)/4)
> ignoring the sign issue, it would have indices [0-140, 255-140] = [-140, 115]
>
> - sf table when sf_scale is -1/(1024*32768):
> pow(2, (i-100)/4) * -pow(2, -25) = -pow(2, (i-200)/4)
> [0-200, 255-200] = [-200, 55]
>
> So, the range of indices into the table should be [-200, 115],
> sf_scale can be replaced by a constant integer offset into the table
> and we handle the signs with a little branching or something. Does
> that sound like a good idea? Any suggestions for alterations before I
> implement it?
I think the table is getting a little big.
We already talked on IRC, and I was told that this code is not
speed-critical, but it could still get some optimizations.
There is ldexp(x,p), which computes x*(2^p). It is supposed to be very fast
because it only changes the exponent and doesn't do a full multiply.
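If we use the identity x^(a+b) = (x^a)*(x^b) with x=2 and i=a+4*b,
then 2^(i/4) = (2^(a/4))*(2^b), which turns into something like
ldexp( pow2sf_tab[i%4], i/4 );
The pow2sf_tab table then only needs indices in the range [-3;3],
since i%4 is negative for negative i (C99 division truncates toward zero).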
This is around 15 times faster than pow.
It is probably slower than the full table,
but it could be a reasonable trade-off ...
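A rough sketch of what I mean, assuming C99 semantics for / and % on negative integers. The names pow2sf_frac and pow2_quarter are only placeholders, and the table is shifted by 3 so that the [-3;3] remainder maps to a normal array index:

```c
#include <math.h>

/* Placeholder 7-entry table: pow2sf_frac[r + 3] == 2^(r/4) for r in [-3, 3]. */
static const float pow2sf_frac[7] = {
    0.59460355750136f, /* 2^(-3/4) */
    0.70710678118655f, /* 2^(-2/4) */
    0.84089641525371f, /* 2^(-1/4) */
    1.0f,              /* 2^(0/4)  */
    1.18920711500272f, /* 2^(1/4)  */
    1.41421356237310f, /* 2^(2/4)  */
    1.68179283050743f, /* 2^(3/4)  */
};

/* Returns 2^(i/4) for any int i in the range discussed above.
 * With C99 truncating division, i == 4*(i/4) + (i%4) and i%4 is in [-3, 3],
 * so 2^(i/4) == 2^((i%4)/4) * 2^(i/4 as an integer). */
static float pow2_quarter(int i)
{
    return ldexpf(pow2sf_frac[i % 4 + 3], i / 4);
}
```

For example, pow2_quarter(-7) takes the 2^(-3/4) entry and scales it by 2^(-1), which is 2^(-7/4). The sf_scale offset (140 or 200) would simply be subtracted from i before the call.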