[FFmpeg-devel] [PATCH] G.729A (now fixed-point) decoder

Tue Mar 25 13:54:58 CET 2008

Hi, Michael

On Mon, Mar 24, 2008 at 1:00 AM, Michael Niedermayer <michaelni at gmx.at> wrote:
> On Thu, Mar 20, 2008 at 10:12:54PM +0600, Vladimir Voroshilov wrote:
>  > Hi, Michael.
>  >
>  > On Thu, Mar 20, 2008 at 8:29 AM, Michael Niedermayer <michaelni at gmx.at> wrote:
>  > > On Sun, Mar 16, 2008 at 12:24:58AM +0600, Vladimir Voroshilov wrote:
>  > >
>  > >  > +Per-vector results (PASS means no hearable differences, FAILED means hearable artefacts):
>  > >  > +algthm  : PASS
>  > >  > +erasure : PASS
>  > >  > +fixed   : PASS
>  > >  > +lsp     : PASS
>  > >
>  > >  > +overflow: FAIL
>  > >
>  > >  fix it!
>  >
>  > "FAIL" here means 39 PSNR vs 65-88 in other tests.
>  > Searching for error still in progress, though.
>
>  just keep in mind this has to be fixed before it will hit svn!
>  Also i like to have binary identical output or an explanation why precissely
>  it is not.

Current code in my local tree is bit-exact :)
All overflows were found and fixed.

>  > +    int subframe_idx;           ///< subframe index (for debugging)
>
>  unused remove it!

Will be removed in next patch. I'll use md5 sums from now.

>  > +/* 4.2.2 */
>  > +#define GAMMA_N 18022 //0.55 in Q15
>  > +#define GAMMA_D 22938 //0.70 in Q15
>
>  s/N/NUM/
>  s/D/DEN/
>  s/GAMMA/FORMANT_PP_FACTOR/

FORMANT_PP_FACTOR_NUM
FORMANT_PP_FACTOR_DEN
?

What about GAMMA_P ;) ?

>  [...]
>  > +/**
>  > + * GA codebook (3.9.2)
>  > + */
>
>  Do you know what GA stands for? hint: gain
>  Write it in the comment!

"GA codebook "->"Gain codebook (first stage)"
"GB codebook "->"Gain codebook (second stage)"

and

cb_GA->cb_gain_1st
cb->GB->cb_gain_2nd

respectively. Ok?

>
>
>  [...]
>
>  > +/**
>  > + * MA predictor (3.2.4)
>  > + */
>
>  do you know what "MA" stands for?
>  I dont, but you should write it in the comment! Its more usefull
>  than to repeat the abbreviation of the variable name.

I found it!
"Moving Average (MA) predictor codebook"

>  [...]
>  > +/**
>  > + * \brief divide two fixed point numbers
>
> > + * \param num numenator
>  > + * \param denom denumenator
>  > + * \param base base to scale result to
>  > + *
>  > + * \return result of division scaled to given base
>  > + */
>  > +int l_div(int num, int denom, uint8_t base)
>  > +{

> > [...]
>
>  you can also get rid of the >> 31
>  and use if(sig<0)
>
>  and the assert(denom) is useless, it will kill the prog with a div by
>  zero anyway, this isnt float stuff with infinity.
>

l_div is gone away and was replaced with "int
div_int16(int16_t,int16_t)" similar to
{
 return ((int32)num)<<15)/denom;
}
which uses two positive numbers (num<denom) as input.

>  [...]
>  > +/**
>  > + * \brief fixed codebook vector modification if delay is less than 40 (4.1.4 and 3.8)
>
>  really? it seems you apply it always?

Correct phrase should be "fixed codebook vector modification if delay
is less than subframe length"
Applied only if pitch_delay<length (look into loop counter initialization)

>  [...]
>  > +
>  > +/**
>  > + * \brief Decoding of the adaptive codebook gain (4.1.5 and 3.9.1)
>
> > + * \param ctx private data structure
>  > + * \param ga_cb_index GA gain codebook index (stage 2)
>  > + * \param gb_cb_index GB gain codebook (stage 2)
>
> > + * \param fc_v (Q13) fixed-codebook vector
>  > + * \param pred_energ_q [in/out] (Q10) past quantized energies
>
> > + * \param subframe_size length of subframe
>  > + *
>  > + * \return (Q1) quantized adaptive-codebook gain (gain code)
>  > + */
>  > +static int16_t g729_get_gain_code(int ga_cb_index, int gb_cb_index, const int16_t* fc_v, int16_t* pred_energ_q, int subframe_size)
>  > +{
>
>  This predicts the gain and then "adds" the correction from the bitstream.
>  maybe that should be split in 2 functions

Do you mean code under after "shift prediction error vector" ?
this is hard to move that 3 lines out of this routine.
There is no other "adds" in this routine.

>  > +
>  > +    return 0;
>  > +}
>
>  > +
>  > +/**
>  > + * \brief Adaptive gain control (4.2.4)
>  > + * \param gain_before (Q0) gain of speech before applying postfilters
>  > + * \param gain_after  (Q0) gain of speech after applying postfilters
>  > + * \param speech [in/out] (Q0) signal buffer
>
> > + * \param subframe_size length of subframe
>  > + * \param gain_prev previous value of gain coefficient
>  > + *
>  > + * \return last value of gain coefficient
>  > + */
>  > +static int16_t g729a_adaptive_gain_control(int gain_before, int gain_after, int16_t *speech, int subframe_size, int16_t gain_prev)
>  > +{
>  > +    int gain; // Q12
>  > +    int n;
>  > +
>  > +    if(!gain_after)
>  > +        return gain_prev;
>  > +
>  > +    if(gain_before)
>  > +    {
>  > +        gain = l_div(gain_after, gain_before, 12); // Q12
>  > +        gain = l_inv_sqrt(gain) >> 11; // Q23 -> Q12
>  > +    }
>  > +    else
>  > +        gain = 0;
>
> > +
>  > +    for(n=0; n<subframe_size; n++)
>  > +    {
>  > +        // 0.9 * ctx->gain_coeff + 0.1 * gain
>
>  > +        gain_prev = (29491 * gain_prev + 3276 * gain) >> 15;
>
>  the >>15 are done at the wrong spot

"3276 * gain >> 15" is calculated before loop now (due to constant value).
line changed to  "gain_prev = gain + ((29491 * gain_prev) >> 15)"

>  > +/**
>  > + * \brief Calculates coefficients of weighted A(z/GAMMA) filter
>  > + * \param Az (Q12) source filter
>  > + * \param gamma (Q15) weight coefficients
>  > + * \param Azg [out] (Q12) resulted weighted A(z/GAMMA) filter
>  > + *
>  > + * Azg[i]=GAMMA^i*Az[i] , i=0..subframe_size
>  > + */
>  > +static void g729a_weighted_filter(const int16_t* Az, int16_t gamma, int16_t *Azg)
>
>  What is gamma? This is not a greek book, gamma identifies a letter. But this
>  clearly is not about that letter.

factor and "weight factor of postfiltering". Ok?

>  > +/**
>  > + * \brief compensates the tilt in the short-term postfilter (4.2.3)
>
> > + * \param ctx private data structure
>  > + * \param lp_gn (Q12) coefficients of A(z/GAMMA_N) filter
>  > + * \param lp_gd (Q12) coefficients of A(z/GAMMA_D) filter
>  > + * \param res_pst [in/out] (Q0) residual signal (partially filtered)
>  > +*/
>
> > +static void g729a_tilt_compensation(G729A_Context *ctx, const int16_t *lp_gn, const int16_t *lp_gd, int16_t* res_pst)
>  > +{
>  > +    int tmp;
>  > +    int gt;      // Q12
>  > +    int rh1,rh0; // Q12
>  > +    int16_t hf_buf[11+22]; // Q12 A(Z/GAMMA_N)/A(z/GAMMA_D) filter impulse response
>  > +    int sum;
>  > +    int i, n;
>  > +
>  > +    memset(hf_buf, 0, 33 * sizeof(int16_t));
>  > +
>
>  > +    hf_buf[10] = 4096; //1.0 in Q12
>
> > +    for(i=0; i<10; i++)
>  > +        hf_buf[i+11] = lp_gn[i];
>  > +
>  > +    /* Apply 1/A(z/GAMMA_D) filter to hf */
>  > +    for(n=0; n<22; n++)
>  > +    {
>  > +        sum=hf_buf[n+10];
>
> > +        for(i=0; i<10; i++)
>  > +            sum -= (lp_gd[i] * hf_buf[n+10-i-1]) >> 12;
>  > +        hf_buf[n+10] = sum;
>  > +    }
>
>  for(n=0; n<22; n++){
>     sum= lp_gn[n];
>
>     for(i=0; i<10; i++)
>         sum -= (lp_gd[i] * hf_buf[n+10-i-1]) >> 12;
>     hf_buf[n+10] = sum;
>  }

Wrong. Out of bounds in lp_gn
Or you mean extend lp_gn to 22 items ?
I don't see here any advantage.

>  [...]
>  > +/**
>  > + * \brief save quantized LSF coefficients for using in next frame
>  > + * \param ctx context structure
>  > + * \param lq quantized LSF coefficients
>  > + */
>  > +static void g729_lq_rotate(G729A_Context *ctx, int* lq)
>  > +{
>  > +    int i, k;
>
> > +
>
>  > +    /* Rotate lq_prev */
>  > +    for(i=0; i<10; i++)
>  > +    {
>  > +        for(k=MA_NP-1; k>0; k--)
>  > +            ctx->lq_prev[k][i] = ctx->lq_prev[k-1][i];
>  > +
>  > +        ctx->lq_prev[0][i] = lq[i];
>  > +    }
>
>  reorder and use memcpy()

done

>  also pass the arrays not the context as parameters, this is cleaner IMHO
>  and makes it more clear what is changed

when i pass lq_prev parameter as "int16**" program segfaults.
Perhaps int[][] array can not be passed as int**.

>  > +        if(!i)
>  > +        {
>  > +            // Decoding of the adaptive-codebook vector delay for first subframe (4.1.3)
>  > +            if(ctx->bad_pitch || ctx->data_error)
>  > +                pitch_delay_3x = 3 * ctx->pitch_delay_int_prev + 1;
>  > +            else
>  > +            {
>
>  > +                if(parm->ac_index[i] >= 197)
>  > +                    pitch_delay_3x = 3 * parm->ac_index[i] - 335;
>  > +                else
>  > +                    pitch_delay_3x = parm->ac_index[i] + 59;
>
>  not sure but somehow the follow looks nicer to me
>
>  pitch_delay_3x = parm->ac_index[i] + 59;
>  if(pitch_delay_3x > 255)
>     pitch_delay_3x = 3 * pitch_delay_3x - 512;

Cool!

>  [...]
>  > +            g729_update_gain_erasure(ctx->pred_energ_q);
>  > +        }
>  > +        else
>  > +        {
>  > +            // Decoding of the fixed codebook gain (4.1.5 and 3.9.1)
>  > +            ctx->gain_pitch = cb_GA[parm->ga_cb_index[i]][0] + cb_GB[parm->gb_cb_index[i]][0];
>  > +
>  > +            ctx->gain_code = g729_get_gain_code(parm->ga_cb_index[i],
>  > +                    parm->gb_cb_index[i],
>
>  besides the very ugly misalignment of the parameters
>  it woudl be MUCH cleaer to look up all values outside, not look up the first 2
>  outside and the next 2 inside the function

1. I used 8-spaces shift, what would be better ?
2. What "first 2" and "next 2" values do you mean ?

After receiving answer about things above and fixing them I'll send
updated patch.
I'm afraid it will require reviewing from top to bottom due to number
of changes.

-- 
Regards,
Vladimir Voroshilov mailto:voroshil at gmail.com
JID: voroshil at gmail.com, voroshil at jabber.ru
ICQ: 95587719