[FFmpeg-devel] [PATCH] Common ACELP code & G.729 [3/7] - vectors operations
Michael Niedermayer
michaelni
Sat May 3 15:35:32 CEST 2008
On Sat, May 03, 2008 at 02:53:40PM +0700, Vladimir Voroshilov wrote:
> Michael Niedermayer wrote:
> > On Fri, May 02, 2008 at 06:43:18PM +0700, Vladimir Voroshilov wrote:
> > > This patch contains routines for operating with adaptive and algebraic vectors.
> > >
> > > FIR filter was moved here.
> > > It now does not use any "frac=-frac" or "if(frac<9)" and can be called
> > > as foo(delay/6, delay%6).
> > > This possibility was not obvious for me due to loop asymmetry,
> > > introduced in reference code (and in specification too)
> > > to avoid "out of bounds" error.
> >
> > [...]
> > > +static const uint8_t mul_5x[8] = {0, 5, 10, 15, 20, 25, 30, 35};
> >
> > *5 should do :)
>
> This was optimization attempt.
> I'll remove all such tables.
> If anyone want to optimize code,
> benchmark it, fill free to do this after commit.
>
> [...]
>
> > > +static uint8_t gray_decode[32] =
> > > +{
> > > + 0, 1, 3, 2, 7, 6, 4, 5,
> > > + 15, 14, 12, 13, 8, 9, 11, 10,
> > > + 31, 30, 28, 29, 24, 25, 27, 26,
> > > + 16, 17, 19, 18, 23, 22, 20, 21
> > > +};
> >
> > this is just x^(x>>1)
>
> Wrong.
sorry, i mixed encoding and decoding up
>
> 4^(4>>1) = 4^2 = 6 != 7 = gray_decode[4]
> 5^(5>>1) = 5^2 = 7 != 6 = gray_decode[5]
> and so on...
>
> Perhaps it was intended to be so till G.729 guys touched it :)
> After seeing fc_2pulses_9bits_track2 table i'm not surpised.
>
>
> > of course only replace the table with above if its smaller/faster, same
> > applies to the other suggested replacements for the tables
> >
> >
> > [...]
> > > +void ff_acelp_fc_4pulses_13bits(
> > > + int16_t* fc_v,
> > > + int fc_index,
> > > + int pulses_signs)
> > > +{
> > > + int i;
> > > + int index;
> > > +
> > > + for(i=0; i<3; i++)
> > > + {
> > > + index = i + mul_5x[fc_index & 7];
> > > + fc_v[ index ] += (pulses_signs & 1) ? 8191 : -8192; // +/-1 in (2.13)
> > > +
> > > + fc_index >>= 3;
> > > + pulses_signs >>= 1;
> > > + }
> > > +
> > > + index = 3 + (fc_index & 1) + mul_5x[(fc_index>>1) & 7];
> > > + fc_v[ index ] += (pulses_signs & 1) ? 8191 : -8192;
> > > +}
> > > +
> >
>
> [...]
>
> > > +void ff_acelp_fc_4pulses_21bits(
> > > + int16_t* fc_v,
> > > + int fc_index,
> > > + int pulses_signs)
> > > +{
> > > + int i;
> > > + int index;
> > > +
> > > + for(i=0; i<3; i++)
> > > + {
> > > + index = i + 5 *(fc_index & 15);
> > > + fc_v[ index ] += (pulses_signs & 1) ? 8191 : -8192; // +/-1 in (2.13)
> > > +
> > > + fc_index >>= 4;
> > > + pulses_signs >>= 1;
> > > + }
> > > +
> > > + index = 3 + (fc_index & 1);
> > > + fc_index >>= 1;
> > > + index += 5 * (fc_index & 15);
> > > + fc_v[ index ] += (pulses_signs & 1) ? 8191 : -8192;
> > > +}
> > > +
> >
> > duplicate of ff_acelp_fc_4pulses_13bits()
>
> Added static routine.
> I want to keep those routines because i'm using pointers to them in
> formats array.
>
> > also the
> > fc_v[ index ] += (pulses_signs & 1) ? 8191 : -8192;
> > stuff is duplicated all over the place
>
> This line adds pulses to fixed vector.
> Lines are the same just because sign bits are packed together each other (one per
> index) and separately from index bits.
> There are also different packing methods (sipr @16k, amr @10.2k)
> I don't think implementing this 1 line of code via macro or inline
> routine is a good idea.
no, i meant
for(i=0; i<2 or 4; i++)
fc_v[ index[i] ] += ((pulses_signs>>i) & 1) ? 8191 : -8192;
But i do not know yet if this is a good idea, it might turn out that its
worse after some other simplifications are done ...
[...]
> +void ff_acelp_interpolate_pitch_vector(
> + int16_t* ac_v,
> + int pitch_delay_int,
> + int pitch_delay_frac,
> + int subframe_size)
> +{
> + int n, i;
> + int v;
> +
> + /* pitch_delay_frac [0; 5]
> + pitch_delay_int [PITCH_LAG_MIN; PITCH_LAG_MAX] */
> + for(n=0; n<subframe_size; n++)
> + {
> + /* 3.7.1 of G.729, Equation 40 */
> + v = ac_v[n - pitch_delay_int] * ff_acelp_interp_filter[0][FFABS(pitch_delay_frac-2)];
> + for(i=1; i<11; i++)
> + {
> +
> + /* Reference G.729 and AMR fixed point code performs clipping after
> + each of the two following accumulations.
> + Since clipping affects only synthetic OVERFLOW test and still not
> + cause int type oveflow, it was moved outside loop. */
> +
> + /* R(x):=ac_v[-k+x] */
> + /* v += R(n-i)*ff_acelp_interp_filter(t+6i) */
> + v += ac_v[n - pitch_delay_int - i] * ff_acelp_interp_filter[i][2-pitch_delay_frac];
> + /* v += R(n+i+1)*ff_acelp_interp_filter(6-t+6i) */
> + v += ac_v[n - pitch_delay_int + i] * ff_acelp_interp_filter[i][pitch_delay_frac-2];
Like with the other patches id like to have the comments seperated somehow
the code is quite hard to read with them intermingled like this.
> + }
> + ac_v[n] = av_clip_int16((v + 0x4000) >> 15);
> + }
> +}
> +
> +/**
> + * \brief common routines for 4pulses_13bits and 4pulses_21bits
> + *
> + * Tables differs only by width, so can be handled via common routine.
> + */
> +static void ff_acelp_4pulses_13_21_bits(
> + int16_t* fc_v,
> + int fc_index,
> + int pulses_signs,
> + int bits)
> +{
> + int mask = (1 << bits) - 1;
> + int i, index;
> +
> + for(i=0; i<3; i++)
> + {
> + index = i + 5 * (fc_index & mask);
> + fc_v[ index ] += (pulses_signs & 1) ? 8191 : -8192; // +/-1 in (2.13)
> +
> + fc_index >>= bits;
> + pulses_signs >>= 1;
> + }
> +
> + index = 3 + (fc_index & 1) + 5 * ((fc_index>>1) & mask);
> + fc_v[ index ] += (pulses_signs & 1) ? 8191 : -8192;
> +}
> +
[...]
> +
> +void ff_acelp_fc_2pulses_9bits_g729d(
> + int16_t* fc_v,
> + int fc_index,
> + int pulses_signs)
> +{
> + int index;
> +
> + index = ((5 * gray_decode[fc_index & 15]) >> 1) + 1;
> + fc_v[ index ] += (pulses_signs & 1) ? 8191 : -8192; // +/-1 in (2.13)
> +
> + pulses_signs >>= 1;
> + fc_index >>= 4;
> +
> + index = fc_2pulses_9bits_track2[gray_decode[fc_index & 31]];
> + fc_v[ index ] += (pulses_signs & 1) ? 8191 : -8192; // +/-1 in (2.13)
> +}
The only thing that seperates these is the different tables
for(i=0; i<pulse_count; i++){
index = i + tab[fc_index & mask];
fc_v[ index ] += (pulses_signs & 1) ? 8191 : -8192;
fc_index >>= bits;
pulses_signs >>= 1;
}
index = tab2[fc_index];
fc_v[ index ] += pulses_signs ? 8191 : -8192;
This also means you no longer need the function pointers.
> +
> +void ff_acelp_fc_enchance_harmonics(
> + int16_t* fc_v,
> + int pitch_delay,
> + int16_t gain_pitch,
> + int length)
> +{
> + int i;
> +
> + for(i=pitch_delay; i<length;i++)
> + fc_v[i] += (fc_v[i - pitch_delay] * gain_pitch) >> 14;
> +}
> +
> +void ff_acelp_build_excitation(
> + int16_t* exc,
> + const int16_t *ac_v,
> + const int16_t *fc_v,
> + int16_t gp,
> + int16_t gc,
> + int subframe_size)
> +{
> + int i;
> +
> + //clipping required here. breaks OVERFLOW test
> + for(i=0; i<subframe_size; i++)
> + exc[i] = av_clip_int16((ac_v[i] * gp + fc_v[i] * gc + 0x2000) >> 14);
> +}
Does it really makes sense to put 1 line loops in functions?
[...]
--
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
Dictatorship naturally arises out of democracy, and the most aggravated
form of tyranny and slavery out of the most extreme liberty. -- Plato
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: Digital signature
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20080503/44a6faa3/attachment.pgp>
More information about the ffmpeg-devel
mailing list