[FFmpeg-devel] [PATCH] use MUL64 in ac3dec.c

Tue Jan 12 18:12:42 CET 2010

On 12.01.2010 14:00, Justin Ruggles wrote:
> Reimar D?ffinger wrote:
> 
>> On Mon, Jan 11, 2010 at 11:11:43PM +0000, M?ns Rullg?rd wrote:
>>> Reimar D?ffinger <Reimar.Doeffinger at gmx.de> writes:
>>>> Even faster (though possibly wrong, even though I hear no issues with
>>>> my samples) should be the variant with MULH, though I could not really
>>>> measure a difference:
>>> Make sure the output is actually identical.
>>>
>>>> Index: libavcodec/ac3dec.c
>>>> ===================================================================
>>>> --- libavcodec/ac3dec.c	(revision 21153)
>>>> +++ libavcodec/ac3dec.c	(working copy)
>>>> @@ -420,10 +420,9 @@
>>>>          int band_end = bin + s->cpl_band_sizes[band];
>>>>          for (ch = 1; ch <= s->fbw_channels; ch++) {
>>>>              if (s->channel_in_cpl[ch]) {
>>>> -                int64_t cpl_coord = s->cpl_coords[ch][band];
>>>> +                int cpl_coord = s->cpl_coords[ch][band] << 9;
>>> Is this certain not to overflow?
>> No idea, I just wanted to mention it. For me it is not any faster so I don't

For me, it /was/ very little faster (still not as fast as mplayer's internal
liba52, but...), so I looked slightly further...

>> intend to investigate that variant further - it was just in case someone here
>> knows the limits for these variables right out of their head.
> 
> 
> The shift can overflow.  The maximum value for cpl_coords is 31<<21.
> 
>>>>                  for (bin = band_start; bin < band_end; bin++) {
>>>> -                    s->fixed_coeffs[ch][bin] = ((int64_t)s->fixed_coeffs[CPL_CH][bin] *
>>>> -                                                cpl_coord) >> 23;
>>>> +                    s->fixed_coeffs[ch][bin] = MULH(s->fixed_coeffs[CPL_CH][bin], cpl_coord);
>>>>                  }
>>> Provided the shift above is safe, that had better work, since the
>>> destination is 32-bit, or the original would be suffering some serious
>>> truncation problems.
>> And if fixed_coeffs can end up being any 32 bit value, that in turn would
>> limit cpl_coord enough to make above shift safe. However it is all speculation.
> 
> fixed_coeffs original value is 24-bit signed, but the final value can be
So
-                int64_t cpl_coord = s->cpl_coords[ch][band];
+                int cpl_coord = s->cpl_coords[ch][band] << 4;
[...]
-                    s->fixed_coeffs[ch][bin] =
((int64_t)s->fixed_coeffs[CPL_CH][bin] *
-                                                cpl_coord) >> 23;
+                    s->fixed_coeffs[ch][bin] =
MULH(s->fixed_coeffs[CPL_CH][bin]<<5, cpl_coord);
is safe?
On amd x2 4850e/32bit/debian etch/gcc 4.1.2/-march=pentium3 -mtune=k8, mplayer's
audio-only test shows (best of few runs):

6ch ac3 track inside mkv, 3600s
before: 0m25.178s md5:8bdaac4f9c119d31a4e016c3a9534622
reimar: 0m25.058s md5:same
my:     0m24.846s md5:same
liba52: 0m24.066s md5:177bf5333a878dd8d14e5a48209dc262

2ch track:
before: 0m10.145s md5:260b1222e7d9e863f8b668e8abc22d25
reimar: 0m10.089s md5:same
my:     0m10.029s md5:same
liba52:  0m9.257s md5:37a9b734c14264d1aca0e3330d5a9871

(I did not tested on other files/other cpus/other gcc versions)

It is also may be more noticeable on arches that have MULH, but does not have
MUL64 (according to grep MUL */mathops.h - bfin and ppc; don't have/cannot
test/don't really know anything about).

> as large as cpl_coords.  I had honestly never looked at the maximum
> value of this before, and it may have other implications in the code
> that I will need to investigate.
> 
> Your first patch looks fine though.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: ffac3-speedup-2.patch
Type: text/x-diff
Size: 1019 bytes
Desc: not available
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20100112/98837d8f/attachment.patch>