[Ffmpeg-devel] mpegaudiodec.c and armv5te optimizations
Michael Niedermayer
michaelni
Wed Oct 4 10:05:46 CEST 2006
Hi
On Wed, Oct 04, 2006 at 01:47:23AM +0300, Siarhei Siamashka wrote:
> On Tuesday 03 October 2006 23:34, Michael Niedermayer wrote:
>
> > > I would like to ask those who are familiar with mp3 decoding algorithm
> > > in mpegaudiodec.c better if there could be any really nasty things
> > > happening after changing current
> > >
> > > #define MULH(a,b) (((int64_t)(a) * (int64_t)(b))>>32)
> > > #define FIXHR(a) ((int)((a) * (1LL<<32) + 0.5))
> > >
> > > to something like
> > >
> > > #define MULH(a,b) (((int64_t)(a) * (int16_t)(b))>>16)
> > > #define FIXHR(a) ((int16_t)((a) * (1LL<<16) + 0.5))
> > >
> > > in low quality decoding mode.
> > >
> > > I tried to decode a few mp3 files and the difference does not seem to be
> > > very noticeable (samples seem to differ +-4 at most).
> >
> > i think the change should be ok (for ARM); for x86 it should be slower
>
> Sure, I just wanted to know if reduction of precision of these constants from
> 32-bit to 16-bit could have any other negative effect. And this optimization
> can really only be used for processors that have a special instruction for
> that operation.
>
> Anyway, here is a simple patch attached.
>
> Tested with the latest mplayer SVN (with some tweaks to get it compiled with
> HAVE_ARMV5TE defined). Configured using:
> CFLAGS="-O4 -pipe -ffast-math -fomit-frame-pointer -mcpu=arm926ej-s -DHAVE_ARMV5TE" ./configure --disable-libavcodec_mpegaudio_hp
>
> Results of decoding mp3 file to /dev/null:
> ffmp3 (current): 58.7 seconds
> ffmp3 (patched): 56.6 seconds
> libmad: 46.2 seconds
>
> Effect is minimal and quite disappointing. We gain very little,
17% of the difference between libmad and ffmp3 (2.1 of 12.5 seconds) isn't that small;
if you do another 5 such optimizations we would beat libmad
> but lose some
> precision.
yes, but it's not much, worst case +-4 difference
btw, could you test by how much the high-low quality difference changes with
this optimization (mean squared error and max difference)? if it doesn't
double the error then I am in favor of applying this patch
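A rough sketch of the measurement I mean, assuming both decoder outputs are
already available as equal-length buffers of 16-bit PCM samples. The names
pcm_diff_stats and pcm_compare are made up for illustration and are not part
of ffmpeg:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper type (not an ffmpeg API): error statistics between
 * a reference decode and a test decode of the same stream. */
typedef struct {
    double mse;      /* mean squared error over all samples */
    int    max_diff; /* largest absolute per-sample difference */
} pcm_diff_stats;

/* Compare n samples of two 16-bit PCM buffers. */
static pcm_diff_stats pcm_compare(const int16_t *ref, const int16_t *test,
                                  size_t n)
{
    pcm_diff_stats s = { 0.0, 0 };
    double sum = 0.0;

    assert(n == 0 || (ref && test));
    for (size_t i = 0; i < n; i++) {
        /* widen to int first so the subtraction cannot overflow */
        int d = (int)ref[i] - (int)test[i];
        if (abs(d) > s.max_diff)
            s.max_diff = abs(d);
        sum += (double)d * d;
    }
    if (n)
        s.mse = sum / (double)n;
    return s;
}
```

Running it once on (high quality, low quality unpatched) and once on
(high quality, low quality patched) gives the two pairs of numbers to compare.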
> But it is understandable, compiler can't load and pack two 16-bit
> constants in a register, also it does not take into account 1 clock penalty
> if the result of multiplication is used immediately in the next instruction.
> So for any really useful results, fully assembler optimized code is required.
or writing a better compiler :)
there's nothing which prevents the compiler from doing these optimizations short
of the incompetence of the developers who wrote the compiler
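For reference, the packing trick can be modeled in portable C. This is only
a sketch: the ARMv5TE SMULWB/SMULWT instructions multiply a 32-bit value by
the bottom or top 16-bit half of a register and keep the top 32 bits of the
48-bit product, which is the same arithmetic as the proposed 16-bit MULH.
The helper names (mulw16, pack16, mulw_bottom, mulw_top) are invented here
and do not exist in ffmpeg:

```c
#include <assert.h>
#include <stdint.h>

/* Model of a 32x16-bit signed multiply keeping the top 32 bits of the
 * 48-bit product. Like the MULH macro, this relies on >> of a negative
 * value being an arithmetic shift, which holds on the usual compilers. */
static inline int32_t mulw16(int32_t a, int16_t b)
{
    return (int32_t)(((int64_t)a * b) >> 16);
}

/* Pack two 16-bit constants into one 32-bit word: lo in bits 0..15,
 * hi in bits 16..31, so one register load serves two multiplies. */
static inline uint32_t pack16(int16_t lo, int16_t hi)
{
    return (uint32_t)(uint16_t)lo | ((uint32_t)(uint16_t)hi << 16);
}

/* Multiply by the bottom half of a packed word (what SMULWB would do). */
static inline int32_t mulw_bottom(int32_t a, uint32_t packed)
{
    return mulw16(a, (int16_t)(packed & 0xFFFF));
}

/* Multiply by the top half of a packed word (what SMULWT would do). */
static inline int32_t mulw_top(int32_t a, uint32_t packed)
{
    return mulw16(a, (int16_t)(packed >> 16));
}
```

With two coefficients stored per word, a table of N constants needs only
N/2 loads, which is the part the compiler does not do on its own.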
[...]
--
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
In the past you could go to a library and read, borrow or copy any book
Today you'd get arrested for mere telling someone where the library is