[FFmpeg-devel] [PATCH] ALS: Solve Issue 1657
Thilo Borgmann
thilo.borgmann
Thu Jan 7 01:19:57 CET 2010
Am 06.01.10 20:46, schrieb Michael Niedermayer:
> On Wed, Jan 06, 2010 at 01:21:10AM +0100, Thilo Borgmann wrote:
>> Am 05.01.10 23:38, schrieb Michael Niedermayer:
>>> On Tue, Jan 05, 2010 at 02:58:56PM +0100, Thilo Borgmann wrote:
>>>> Am 05.01.10 12:11, schrieb Michael Niedermayer:
>>>>> On Tue, Jan 05, 2010 at 03:59:41AM +0100, Thilo Borgmann wrote:
>>>>>> Am 05.01.10 02:43, schrieb Michael Niedermayer:
>>>>>>> On Tue, Jan 05, 2010 at 02:23:09AM +0100, Thilo Borgmann wrote:
>>>>>>>> Am 05.01.10 01:43, schrieb Michael Niedermayer:
>>>>>>>>> On Tue, Jan 05, 2010 at 12:34:53AM +0100, Thilo Borgmann wrote:
>>>>>>>>>> Am 05.01.10 00:30, schrieb Thilo Borgmann:
>>>>>>>>>>> Hi,
>>>>>>>>>>>
>>>>>>>>>>> issue 1657 seems to be caused by negative indices used in [].
>>>>>>>>>>> See: http://roundup.ffmpeg.org/roundup/ffmpeg/issue1657
>>>>>>>>>>>
>>>>>>>>>>> Using *() resolves this issue.
>>>>>>>>>>>
>>>>>>>>>>> Tested with gcc 4.0 on MacOS 10.6. There were other versions/compilers
>>>>>>>>>>> mentioned in roundup, maybe these could be tested by someone (you)?
>>>>>>>>>>>
>>>>>>>>>>> I'm sorry, my svn still seems to be broken and produces unusable patches
>>>>>>>>>>> (%ld...). Nevertheless I can apply them if the workaround is ok.
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Some artifacts left in als_data.h. Ignore the old patch, updated patch
>>>>>>>>>> attached.
>>>>>>>>>>
>>>>>>>>>> -Thilo
>>>>>>>>>
>>>>>>>>>> alsdec.c | 4 ++--
>>>>>>>>>> 1 file changed, 2 insertions(+), 2 deletions(-)
>>>>>>>>>> 987821d84540420efa6f2e67be17094074e638f8 als_issue1657.rev1.patch
>>>>>>>>>> Index: libavcodec/alsdec.c
>>>>>>>>>> ===================================================================
>>>>>>>>>> --- libavcodec/alsdec.c (Revision 21025)
>>>>>>>>>> +++ libavcodec/alsdec.c (Arbeitskopie)
>>>>>>>>>> @@ -%ld,%ld +%ld,%ld @@
>>>>>>>>>> y = 1 << 19;
>>>>>>>>>>
>>>>>>>>>> for (sb = 0; sb < smp; sb++)
>>>>>>>>>> - y += MUL64(lpc_cof[sb],raw_samples[smp - (sb + 1)]);
>>>>>>>>>> + y += MUL64(lpc_cof[sb], *(raw_samples + smp - (sb + 1)));
>>>>>>>>>
>>>>>>>>> patch ok
>>>>>>>>
>>>>>>>> Applied.
>>>>>>>>
>>>>>>>>>
>>>>>>>>> independant of this, it could be that if lpc_cof was reversed
>>>>>>>>>
>>>>>>>>> for (sb = 0; sb < smp; sb++)
>>>>>>>>> y += MUL64(lpc_cof[sb], raw_samples[sb]);
>>>>>>>>
>>>>>>>> In the second case the index has to be negative which is not possible
>>>>>>>> with this approach.
>>>>>>>
>>>>>>> i dont see why
>>>>>>> also it should be
>>>>>>> raw_samples[sb]
>>>>>>> and
>>>>>>> raw_samples++
>>>>>>
>>>>>> Hmm. If you could please elaborate a bit more, I will pick this up
>>>>>> tomorrow and will see if I can make it better.
>>>>>
>>>>> one way:
>>>>>
>>>>> - for (; smp < bd->block_length; smp++) {
>>>>> - y = 1 << 19;
>>>>> -
>>>>> - for (sb = 0; sb < opt_order; sb++)
>>>>> - y += MUL64(bd->lpc_cof[sb],raw_samples[smp - (sb + 1)]);
>>>>> -
>>>>> - raw_samples[smp] -= y >> 20;
>>>>> + for (raw_samples+=smp; raw_samples < raw_samples_end; raw_samples++) {
>>>>> + y = 1 << 19;
>>>>> +
>>>>> + for (sb= -opt_order; sb<0; sb++)
>>>>> + y += MUL64(lpc_cof[sb],raw_samples[sb]);
>>>>> +
>>>>> + raw_samples[0] -= y >> 20;
>>>>> }
>>>>
>>>> Ok I see, slightly different patch attached that has a ~20% speed gain
>>>> in the second loop alone. Also issue 1657 is avoided by not using smp in
>>>> substraction (and yes, tested with 64-bit & gcc 4.0).
>>>>
>>>
>>>> To have lpc_cof[sb] as well as raw_samples[sb] without math in the
>>>> indices, lpc_cof has to be reverted and its pointer shifted to one next
>>>> to the end (I think...) - this would make it too obfuscated IMO.
>>>
>>> could you try? it would be interresting to see how much faster it is
>>
>> Seems not to be faster. Also, reversed order of lpc_cof would shift this
>> math from these loops into others and reversion has also to be done
>> somewhere.
>>
>> 17446 dezicycles in reversed array, 2042 runs, 6 skips
>> 18731 dezicycles in reversed array, 2038 runs, 10 skips
>> 17446 dezicycles in reversed array, 2042 runs, 6 skips
>>
>> 17651 dezicycles in simple math, 2040 runs, 8 skips
>> 18497 dezicycles in simple math, 2039 runs, 9 skips
>> 17696 dezicycles in simple math, 2038 runs, 10 skips
>
> where can i find an ALS file for trying myself?
http://samples.mplayerhq.hu/fate-suite/lossless-audio/inside-als.mp4
is ready for download, if you would like to test with the very same
file, I could also provide that somehow.
>
>
>>
>>
>>
>> Code snippet:
>>
>>> // reverse linear prediction coefficients for efficiency
>>> for (k = 0; k < opt_order; k++)
>>> lpc_cof_reversed[k] = lpc_cof[opt_order - (k + 1)];
>>>
>>> START_TIMER;
>>> // reconstruct raw samples
>>> raw_samples = bd->raw_samples + smp;
>>> lpc_cof = lpc_cof_reversed + opt_order;
>>>
>>> for (; smp < bd->block_length; smp++) {
>>> y = 1 << 19;
>>>
>>> // simple:
>>> // for (sb = 0; sb < opt_order; sb++)
>>> // y += MUL64(bd->lpc_cof[sb], raw_samples[-(sb + 1)]);
>>> // reversed:
>>> for (sb = -opt_order; sb < 0; sb++)
>>> y += MUL64(lpc_cof[sb], raw_samples[sb]);
>>>
>>> *raw_samples++ -= y >> 20;
>>> }
>
> checking the end via raw_samples < end should be faster
> (1 variable less, 1 ++ less)
> also if this is on x86_64 try to make sb long instead of int this
> too might help
Yes and yes:
13869 dezicycles in reversed array & end, 2039 runs, 9 skips
13933 dezicycles in reversed array & end, 2040 runs, 8 skips
14816 dezicycles in reversed array & end, 2039 runs, 9 skips
11919 dezicycles in reversed array & end & long sb, 2040 runs, 8 skips
11947 dezicycles in reversed array & end & long sb, 2039 runs, 9 skips
12038 dezicycles in reversed array & end & long sb, 2040 runs, 8 skips
If this long sb shall make it into a patch, how to check for 64-bit,
e.g. using int or long, the ffmpeg-way?
-Thilo
More information about the ffmpeg-devel
mailing list