[FFmpeg-devel] [PATCH] g722 decoder, no licensing fame

Sat Apr 4 22:46:30 CEST 2009

On Sat, Apr 04, 2009 at 12:39:46PM -0700, Kenan Gillet wrote:
>
> On Apr 4, 2009, at 12:06 PM, Kenan Gillet wrote:
>
>> Hi
>> On Apr 4, 2009, at 11:46 AM, Kenan Gillet wrote:
>>>
>>> On Apr 3, 2009, at 9:42 PM, Michael Niedermayer wrote:
>>>
>>>> On Tue, Mar 31, 2009 at 11:34:34PM -0700, Kenan Gillet wrote:
>>
>> [...]
>>>>> +/**
>>>>> + * adaptive predictor
>>>>> + *
>>>>> + * @note On x86 using the MULL macro in a loop is slower than not using the macro.
>>>>> + */
>>>>> +static void do_adaptive_prediction(struct G722Band *band, const int cur_diff)
>>>>> +{
>>>>> +    int sg[2], limit, i, cur_part_reconst;
>>>>> +
>>>>> +    band->qtzd_reconst_mem[1] = band->qtzd_reconst_mem[0];
>>>>> +    band->qtzd_reconst_mem[0] = av_clip_int16((band->s_predictor + cur_diff) << 1);
>>>>> +
>>>>> +    cur_part_reconst = band->s_zero + cur_diff < 0;
>>>>> +
>>>>> +    sg[0] = sign_lookup[cur_part_reconst != band->part_reconst_mem[0]];
>>>>> +    sg[1] = sign_lookup[cur_part_reconst == band->part_reconst_mem[1]];
>>>>
>>>> I don't see why a LUT should be used here; it's not really more readable
>>>> and I doubt it's faster.
>>>
>>> it is faster
>>>
>>> on a Core 2 Duo 2 GHz, gcc 4.2.1
>>> LUT based:
>>> testing for 64kb 16KHz: [OK][ 1574 dezicycles in do_adaptive_prediction, 262130 runs, 14 skips ]
>>> testing for 56kb 16KHz: [OK][ 1668 dezicycles in do_adaptive_prediction, 262110 runs, 34 skips ]
>>> testing for 48kb 16KHz: [OK][ 1607 dezicycles in do_adaptive_prediction, 262112 runs, 32 skips ]
>>> testing for 64Kb  8KHz: [OK][ 1584 dezicycles in do_adaptive_prediction, 131066 runs, 6 skips ]
>>> testing encoding for 64Kb  16KHz: [OK][ 1558 dezicycles in do_adaptive_prediction, 262108 runs, 36 skips ]
>>>
>>>
>>> non-LUT:
>>> testing for 64kb 16KHz: [OK][ 1686 dezicycles in do_adaptive_prediction, 262126 runs, 18 skips ]
>>> testing for 56kb 16KHz: [OK][ 1719 dezicycles in do_adaptive_prediction, 262107 runs, 37 skips ]
>>> testing for 48kb 16KHz: [OK][ 1689 dezicycles in do_adaptive_prediction, 262126 runs, 18 skips ]
>>> testing for 64Kb  8KHz: [OK][ 1676 dezicycles in do_adaptive_prediction, 131065 runs, 7 skips ]
>>> testing encoding for 64Kb  16KHz: [OK][ 1673 dezicycles in do_adaptive_prediction, 262116 runs, 28 skips ]
>>>
>>> i can revert to non-LUT if you prefer
>>
>> sorry for the noise, I thought the comment was about using the sg array,
>> not just the lookup table.
>> I have removed the use of the lookup table locally.
>
> I knew I should not have answered before benchmarking;
> counterintuitive, but the LUT is still faster

What code did you use for the non-LUT variant?

1 | - (a==b)
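
To make the comparison concrete, here is a minimal sketch of the two variants side by side. It assumes sign_lookup is a two-entry table { -1, 1 }; the table contents are not shown in the quoted patch, so that is an assumption, and the helper names sign_lut and sign_arith are hypothetical, used only for illustration.

```c
#include <stdint.h>

/* Assumption: the patch's table maps index 0 to -1 and index 1 to +1. */
static const int8_t sign_lookup[2] = { -1, 1 };

/* LUT variant: a comparison result (0 or 1) indexes the table.
 * Yields +1 when a and b differ, -1 when they are equal. */
static int sign_lut(int a, int b)
{
    return sign_lookup[a != b];
}

/* Arithmetic variant: (a == b) is 0 or 1, so -(a == b) is 0 or -1.
 * On two's complement targets -1 has all bits set, so OR-ing with 1
 * gives +1 when a and b differ and -1 when they are equal. */
static int sign_arith(int a, int b)
{
    return 1 | -(a == b);
}
```

Under that assumption both helpers return the same value for every input pair, so 1 | -(a == b) would stand in for sign_lookup[a != b], and 1 | -(a != b) for the sign_lookup[a == b] form used for sg[1]. Which one is faster still has to be measured.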

[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Republics decline into democracies and democracies degenerate into
despotisms. -- Aristotle