[FFmpeg-devel] AMR-NB decoder
Michael Niedermayer
michaelni
Mon Aug 10 14:48:54 CEST 2009
On Mon, Aug 10, 2009 at 08:42:53AM +0100, Colin McQuillan wrote:
> 2009/8/9 Måns Rullgård <mans at mansr.com>:
> > Colin McQuillan <m.niloc at googlemail.com> writes:
> >
> >> 2009/8/8 Michael Niedermayer <michaelni at gmx.at>:
> >>> On Sat, Aug 08, 2009 at 04:09:39PM +0100, Colin McQuillan wrote:
> >>>> 2009/8/8 Michael Niedermayer <michaelni at gmx.at>:
> >>>> > On Fri, Aug 07, 2009 at 08:23:53PM +0100, Colin McQuillan wrote:
> >>>> >> 2009/8/6 Michael Niedermayer <michaelni at gmx.at>:
> >>>> >> > On Wed, Aug 05, 2009 at 05:51:36PM +0100, Colin McQuillan wrote:
> >>>> >> >> Attached is a patch for an AMR-NB decoder.
> >>>>
> >>>> [...]
> >>>>
> >>>> >> > that should be a separate patch
> >>>> >>
> >>>> >> I'll leave this one until I investigate a version for sparse vectors. Attached:
> >>>> >>
> >>>> >> 1. Helper functions for gain control in floating-point codecs
> >>>> >> I couldn't find a similar fixed point function to copy the function name.
> >>>> >>
> >>>> >> 2. Floating-point version of ff_acelp_high_pass_filter
> >>>> >
> >>>> >>  acelp_vectors.c |   22 ++++++++++++++++++++++
> >>>> >>  acelp_vectors.h |   27 +++++++++++++++++++++++++++
> >>>> >>  2 files changed, 49 insertions(+)
> >>>> >> f1abbee9b62c1779fd5fb1c634d4ab4294d8611d  get-set-energyf.patch
> >>>> >> Index: libavcodec/acelp_vectors.c
> >>>> >> ===================================================================
> >>>> >> --- libavcodec/acelp_vectors.c        (revision 19606)
> >>>> >> +++ libavcodec/acelp_vectors.c        (working copy)
> >>>> >> @@ -155,3 +155,25 @@
> >>>> >>          out[i] = weight_coeff_a * in_a[i]
> >>>> >>                 + weight_coeff_b * in_b[i];
> >>>> >>  }
> >>>> >> +
> >>>> >> +float ff_energyf(const float *v, int length)
> >>>> >> +{
> >>>> >> +    float sum = 0;
> >>>> >> +    int i;
> >>>> >> +
> >>>> >> +    for (i = 0; i < length; i++)
> >>>> >> +        sum += v[i] * v[i];
> >>>> >> +
> >>>> >> +    return sum;
> >>>> >> +}
> >>>> >
> >>>> > ff_dot_productf()
> >>>>
> >>>> Do you mean that ff_energyf is redundant? I've taken it out.
> >>>
> >>> hmm, well, now that you put it that way, ff_energyf() could be faster due to
> >>> fewer memory reads; if that is the case in practice, it could be kept
> >>
> >> ff_energyf is reliably 4% faster in my test, so I'll add it back in.
> >
> > That function has high simdicity so it should be added to dsputil and
> > simdified.
>
> I'll try, but I didn't mean to imply that energy calculations are
> critical to performance. The slow parts of the AMR decoder are the IIR
> and FIR filters, which are already in celp_filters.c.
If the SIMD energy function does not reach an overall (whole-codec) speedup of
0.1% over using a more generic SIMD dot product, then it's probably not worth it
and could be omitted.
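
For reference, the difference between the two approaches can be sketched in
plain C. This is only an illustration: the function names below are
hypothetical stand-ins, not the actual ff_dot_productf()/ff_energyf()
implementations. The dedicated version loads each element once, while the
generic dot product called with the same vector twice loads it through two
pointers.

```c
#include <assert.h>

/* Hypothetical stand-in for a generic dot product: two loads per element. */
static float dot_productf_sketch(const float *a, const float *b, int length)
{
    float sum = 0.0f;
    assert(length >= 0);
    for (int i = 0; i < length; i++)
        sum += a[i] * b[i];
    return sum;
}

/* Hypothetical stand-in for a dedicated sum of squares: one load per element. */
static float energyf_sketch(const float *v, int length)
{
    float sum = 0.0f;
    assert(length >= 0);
    for (int i = 0; i < length; i++)
        sum += v[i] * v[i];
    return sum;
}
```

Calling dot_productf_sketch(v, v, n) gives the same value as
energyf_sketch(v, n); the question is only whether the saved loads are
measurable at the level of the whole codec.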
>
> Attached is "Implement vector energy calculation in dsputil".
> dsputil.c | 12 ++++++++++++
> dsputil.h | 2 ++
> x86/dsputil_mmx.c | 26 ++++++++++++++++++++++++++
> 3 files changed, 40 insertions(+)
> bfa9b2ddf24406c925efda2d4a58e3bb078e74fb vector_energyf.patch
> Index: libavcodec/x86/dsputil_mmx.c
> ===================================================================
> --- libavcodec/x86/dsputil_mmx.c (revision 19613)
> +++ libavcodec/x86/dsputil_mmx.c (working copy)
> @@ -2051,6 +2051,31 @@
> }
> }
>
> +static float vector_energyf_sse(const float *src, int len)
> +{
> + float result;
> + x86_reg i = (len - 4) * 4;
> + __asm__ volatile(
> + "xorps %%xmm2, %%xmm2 \n"
> + "1: \n"
> + "movaps (%2,%0), %%xmm0 \n"
> + "movaps %%xmm0, %%xmm1 \n"
> + "mulps %%xmm0, %%xmm1 \n"
> + "addps %%xmm1, %%xmm2 \n"
movaps (%2,%0), %%xmm0
mulps %%xmm0, %%xmm0
addps %%xmm0, %%xmm2
> + "sub $16, %0 \n"
> + "jge 1b \n"
> + "movlhps %%xmm2, %%xmm1 \n"
> + "addps %%xmm2, %%xmm1 \n"
> + "shufps $0xBB, %%xmm1, %%xmm2 \n"
> + "addps %%xmm1, %%xmm2 \n"
> + "movhlps %%xmm2, %%xmm2 \n"
> + "movss %%xmm2, %1 \n"
I wonder if that's the fastest way to do it ...
[...]
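
To make the discussion of the loop and the final reduction easier to follow,
here is a portable C model of what the SSE code computes. It is a sketch under
assumptions, not the patch itself: the name vector_energyf_model is
hypothetical, the four-element array stands in for the %xmm2 accumulator, and
the model does not need the 16-byte alignment that movaps requires.

```c
#include <assert.h>

/* Portable model of the SSE routine: four partial sums, one per lane. */
static float vector_energyf_model(const float *src, int len)
{
    float lane[4] = { 0.0f, 0.0f, 0.0f, 0.0f }; /* plays the role of xmm2 */

    assert(len > 0 && len % 4 == 0);

    /* Walk backwards in steps of four floats, like "sub $16, %0; jge 1b". */
    for (int i = len - 4; i >= 0; i -= 4)
        for (int j = 0; j < 4; j++)
            lane[j] += src[i + j] * src[i + j]; /* mulps + addps */

    /* Horizontal sum: fold the high pair onto the low pair, then add. */
    float lo = lane[0] + lane[2];
    float hi = lane[1] + lane[3];
    return lo + hi;
}
```

Whatever instruction sequence ends up being fastest for the last step, it has
to produce this same sum of the four lanes.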
> Index: libavcodec/dsputil.h
> ===================================================================
> --- libavcodec/dsputil.h (revision 19613)
> +++ libavcodec/dsputil.h (working copy)
> @@ -387,6 +387,8 @@
> void (*ac3_downmix)(float (*samples)[256], float (*matrix)[2], int out_ch, int in_ch, int len);
> /* no alignment needed */
> void (*flac_compute_autocorr)(const int32_t *data, int len, int lag, double *autoc);
> + /* assume len is a multiple of 4, and arrays are 16-byte aligned */
> + float (*vector_energyf)(const float *src, int len);
Alignment requirements are supposed to be written like:
void ff_vp3_idct_put_c(uint8_t *dest/*align 8*/, int line_size, DCTELEM *block/*align 16*/);
Also, "/*" is not doxygen compatible, and what the function does should be
described more verbosely; energy isn't a mathematically clear term, dot
product and sum of squares are.
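
As an illustration of the above, a declaration along these lines would carry
both a doxygen comment and the inline alignment annotation. This is a sketch
only: the name vector_sum_squaresf_c and the exact wording are hypothetical,
not what the patch has to use.

```c
#include <assert.h>

/**
 * Compute the sum of squares of a float vector, that is, the dot product
 * of the vector with itself.
 *
 * @param src input vector, must be 16-byte aligned
 * @param len number of elements, must be a multiple of 4
 * @return sum of src[i] * src[i] for 0 <= i < len
 */
static float vector_sum_squaresf_c(const float *src/*align 16*/, int len)
{
    float sum = 0.0f;

    assert(len >= 0 && len % 4 == 0);
    for (int i = 0; i < len; i++)
        sum += src[i] * src[i];
    return sum;
}
```

The C reference version does not itself depend on the alignment; the
annotation documents what SIMD implementations of the same function pointer
are allowed to assume.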
[...]
--
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
Everything should be made as simple as possible, but not simpler.
-- Albert Einstein