[FFmpeg-devel] [PATCH 2/3] avcodec/aacsbr: Add comment about possibly optimization in sbr_dequant()

Sat Dec 12 19:08:01 CET 2015

On Sat, Dec 12, 2015 at 12:58 PM, Michael Niedermayer <michaelni at gmx.at> wrote:
> On Fri, Dec 11, 2015 at 12:09:57PM -0500, Ganesh Ajjanagadde wrote:
>> On Fri, Dec 11, 2015 at 11:36 AM, Andreas Cadhalpun
>> <andreas.cadhalpun at googlemail.com> wrote:
>> > On 11.12.2015 17:21, Ganesh Ajjanagadde wrote:
>> >> On Fri, Dec 11, 2015 at 11:16 AM, Andreas Cadhalpun
>> >> <andreas.cadhalpun at googlemail.com> wrote:
>> >>> On 19.11.2015 14:17, Michael Niedermayer wrote:
>> >>>> From: Michael Niedermayer <michael at niedermayer.cc>
>> >>>>
>> >>>> Signed-off-by: Michael Niedermayer <michael at niedermayer.cc>
>> >>>> ---
>> >>>>  libavcodec/aacsbr.c |    1 +
>> >>>>  1 file changed, 1 insertion(+)
>> >>>>
>> >>>> diff --git a/libavcodec/aacsbr.c b/libavcodec/aacsbr.c
>> >>>> index d1e3a91..e014646 100644
>> >>>> --- a/libavcodec/aacsbr.c
>> >>>> +++ b/libavcodec/aacsbr.c
>> >>>> @@ -73,6 +73,7 @@ static void sbr_dequant(SpectralBandReplication *sbr, int id_aac)
>> >>>>  {
>> >>>>      int k, e;
>> >>>>      int ch;
>> >>>> +    //TODO: Replace exp2f(0.5*x) by a LUT, the inputs are all integer and have a small range
>> >>>>
>> >>>>      if (id_aac == TYPE_CPE && sbr->bs_coupling) {
>> >>>>          float alpha      = sbr->data[0].bs_amp_res ?  1.0f :  0.5f;
>> >>>>
>> >>>
>> >>> This shouldn't hurt, with or without the clarification requested by Ganesh.
>> >>
>> >> I am doing related work cleaning up and optimizing usages of slow libm
>> >> functions such as pow and exp2. Do you know the exact possible range
>> >> of the inputs x, and if so, can it be added to the comment? That will
>> >> be very helpful for me to come up with a patch. Thanks.
>> >
>> > The exp2f expressions are:
>> > exp2f(sbr->data[0].env_facs_q[e][k] * alpha + 7.0f);
>> > exp2f((pan_offset - sbr->data[1].env_facs_q[e][k]) * alpha);
>> > exp2f(NOISE_FLOOR_OFFSET - sbr->data[0].noise_facs_q[e][k] + 1);
>> > exp2f(12 - sbr->data[1].noise_facs_q[e][k]);
>> > exp2f(alpha * sbr->data[ch].env_facs_q[e][k] + 6.0f);
>> > exp2f(NOISE_FLOOR_OFFSET - sbr->data[ch].noise_facs_q[e][k]);
>> >
>> > Here alpha is 1 or 0.5, pan_offset 12 or 24 and NOISE_FLOOR_OFFSET is 6.
>> > After patch 3 of this series, env_facs_q is in the range from 0 to 127 and
>> > noise_facs_q is already limited to the range from 0 to 30.
>> >
>> > So x should always be in the range -300..300, or so.
>>
>> Very good, thanks a lot.
>>
>> Based on the above range, my idea is to not even use a LUT, but use
>> something like exp2fi followed by multiplication by M_SQRT2 depending
>> on even or odd.
>
> conditional operations can potentially be rather slow due to branch
> misprediction

I think it will still be far faster than exp2f, and in the absence of
hard numbers, I view this as a far better approach than a large (~300
element) LUT. Of course, confirming this, and by how much, will need to
wait for actual benchmarks.
