[FFmpeg-devel] [PATCH] G.729A (now fixed-point) decoder
Vladimir Voroshilov
voroshil
Tue Mar 25 13:54:58 CET 2008
Hi, Michael
On Mon, Mar 24, 2008 at 1:00 AM, Michael Niedermayer <michaelni at gmx.at> wrote:
> On Thu, Mar 20, 2008 at 10:12:54PM +0600, Vladimir Voroshilov wrote:
> > Hi, Michael.
> >
> > On Thu, Mar 20, 2008 at 8:29 AM, Michael Niedermayer <michaelni at gmx.at> wrote:
> > > On Sun, Mar 16, 2008 at 12:24:58AM +0600, Vladimir Voroshilov wrote:
> > >
> > > > +Per-vector results (PASS means no hearable differences, FAILED means hearable artefacts):
> > > > +algthm : PASS
> > > > +erasure : PASS
> > > > +fixed : PASS
> > > > +lsp : PASS
> > >
> > > > +overflow: FAIL
> > >
> > > fix it!
> >
> > "FAIL" here means 39 PSNR vs 65-88 in other tests.
> > Searching for error still in progress, though.
>
> just keep in mind this has to be fixed before it will hit svn!
> Also i like to have binary identical output or an explanation why precissely
> it is not.
Current code in my local tree is bit-exact :)
All overflows were found and fixed.
> > + int subframe_idx; ///< subframe index (for debugging)
>
> unused remove it!
Will be removed in next patch. I'll use md5 sums from now.
> > +/* 4.2.2 */
> > +#define GAMMA_N 18022 //0.55 in Q15
> > +#define GAMMA_D 22938 //0.70 in Q15
>
> s/N/NUM/
> s/D/DEN/
> s/GAMMA/FORMANT_PP_FACTOR/
FORMANT_PP_FACTOR_NUM
FORMANT_PP_FACTOR_DEN
?
What about GAMMA_P ;) ?
> [...]
> > +/**
> > + * GA codebook (3.9.2)
> > + */
>
> Do you know what GA stands for? hint: gain
> Write it in the comment!
"GA codebook "->"Gain codebook (first stage)"
"GB codebook "->"Gain codebook (second stage)"
and
cb_GA->cb_gain_1st
cb->GB->cb_gain_2nd
respectively. Ok?
>
>
> [...]
>
> > +/**
> > + * MA predictor (3.2.4)
> > + */
>
> do you know what "MA" stands for?
> I dont, but you should write it in the comment! Its more usefull
> than to repeat the abbreviation of the variable name.
I found it!
"Moving Average (MA) predictor codebook"
> [...]
> > +/**
> > + * \brief divide two fixed point numbers
>
> > + * \param num numenator
> > + * \param denom denumenator
> > + * \param base base to scale result to
> > + *
> > + * \return result of division scaled to given base
> > + */
> > +int l_div(int num, int denom, uint8_t base)
> > +{
> > [...]
>
> you can also get rid of the >> 31
> and use if(sig<0)
>
> and the assert(denom) is useless, it will kill the prog with a div by
> zero anyway, this isnt float stuff with infinity.
>
l_div is gone away and was replaced with "int
div_int16(int16_t,int16_t)" similar to
{
return ((int32)num)<<15)/denom;
}
which uses two positive numbers (num<denom) as input.
> [...]
> > +/**
> > + * \brief fixed codebook vector modification if delay is less than 40 (4.1.4 and 3.8)
>
> really? it seems you apply it always?
Correct phrase should be "fixed codebook vector modification if delay
is less than subframe length"
Applied only if pitch_delay<length (look into loop counter initialization)
> [...]
> > +
> > +/**
> > + * \brief Decoding of the adaptive codebook gain (4.1.5 and 3.9.1)
>
> > + * \param ctx private data structure
> > + * \param ga_cb_index GA gain codebook index (stage 2)
> > + * \param gb_cb_index GB gain codebook (stage 2)
>
> > + * \param fc_v (Q13) fixed-codebook vector
> > + * \param pred_energ_q [in/out] (Q10) past quantized energies
>
> > + * \param subframe_size length of subframe
> > + *
> > + * \return (Q1) quantized adaptive-codebook gain (gain code)
> > + */
> > +static int16_t g729_get_gain_code(int ga_cb_index, int gb_cb_index, const int16_t* fc_v, int16_t* pred_energ_q, int subframe_size)
> > +{
>
> This predicts the gain and then "adds" the correction from the bitstream.
> maybe that should be split in 2 functions
Do you mean code under after "shift prediction error vector" ?
this is hard to move that 3 lines out of this routine.
There is no other "adds" in this routine.
> > +
> > + return 0;
> > +}
>
> > +
> > +/**
> > + * \brief Adaptive gain control (4.2.4)
> > + * \param gain_before (Q0) gain of speech before applying postfilters
> > + * \param gain_after (Q0) gain of speech after applying postfilters
> > + * \param speech [in/out] (Q0) signal buffer
>
> > + * \param subframe_size length of subframe
> > + * \param gain_prev previous value of gain coefficient
> > + *
> > + * \return last value of gain coefficient
> > + */
> > +static int16_t g729a_adaptive_gain_control(int gain_before, int gain_after, int16_t *speech, int subframe_size, int16_t gain_prev)
> > +{
> > + int gain; // Q12
> > + int n;
> > +
> > + if(!gain_after)
> > + return gain_prev;
> > +
> > + if(gain_before)
> > + {
> > + gain = l_div(gain_after, gain_before, 12); // Q12
> > + gain = l_inv_sqrt(gain) >> 11; // Q23 -> Q12
> > + }
> > + else
> > + gain = 0;
>
> > +
> > + for(n=0; n<subframe_size; n++)
> > + {
> > + // 0.9 * ctx->gain_coeff + 0.1 * gain
>
> > + gain_prev = (29491 * gain_prev + 3276 * gain) >> 15;
>
> the >>15 are done at the wrong spot
"3276 * gain >> 15" is calculated before loop now (due to constant value).
line changed to "gain_prev = gain + ((29491 * gain_prev) >> 15)"
> > +/**
> > + * \brief Calculates coefficients of weighted A(z/GAMMA) filter
> > + * \param Az (Q12) source filter
> > + * \param gamma (Q15) weight coefficients
> > + * \param Azg [out] (Q12) resulted weighted A(z/GAMMA) filter
> > + *
> > + * Azg[i]=GAMMA^i*Az[i] , i=0..subframe_size
> > + */
> > +static void g729a_weighted_filter(const int16_t* Az, int16_t gamma, int16_t *Azg)
>
> What is gamma? This is not a greek book, gamma identifies a letter. But this
> clearly is not about that letter.
factor and "weight factor of postfiltering". Ok?
> > +/**
> > + * \brief compensates the tilt in the short-term postfilter (4.2.3)
>
> > + * \param ctx private data structure
> > + * \param lp_gn (Q12) coefficients of A(z/GAMMA_N) filter
> > + * \param lp_gd (Q12) coefficients of A(z/GAMMA_D) filter
> > + * \param res_pst [in/out] (Q0) residual signal (partially filtered)
> > +*/
>
> > +static void g729a_tilt_compensation(G729A_Context *ctx, const int16_t *lp_gn, const int16_t *lp_gd, int16_t* res_pst)
> > +{
> > + int tmp;
> > + int gt; // Q12
> > + int rh1,rh0; // Q12
> > + int16_t hf_buf[11+22]; // Q12 A(Z/GAMMA_N)/A(z/GAMMA_D) filter impulse response
> > + int sum;
> > + int i, n;
> > +
> > + memset(hf_buf, 0, 33 * sizeof(int16_t));
> > +
>
> > + hf_buf[10] = 4096; //1.0 in Q12
>
> > + for(i=0; i<10; i++)
> > + hf_buf[i+11] = lp_gn[i];
> > +
> > + /* Apply 1/A(z/GAMMA_D) filter to hf */
> > + for(n=0; n<22; n++)
> > + {
> > + sum=hf_buf[n+10];
>
> > + for(i=0; i<10; i++)
> > + sum -= (lp_gd[i] * hf_buf[n+10-i-1]) >> 12;
> > + hf_buf[n+10] = sum;
> > + }
>
> for(n=0; n<22; n++){
> sum= lp_gn[n];
>
> for(i=0; i<10; i++)
> sum -= (lp_gd[i] * hf_buf[n+10-i-1]) >> 12;
> hf_buf[n+10] = sum;
> }
Wrong. Out of bounds in lp_gn
Or you mean extend lp_gn to 22 items ?
I don't see here any advantage.
> [...]
> > +/**
> > + * \brief save quantized LSF coefficients for using in next frame
> > + * \param ctx context structure
> > + * \param lq quantized LSF coefficients
> > + */
> > +static void g729_lq_rotate(G729A_Context *ctx, int* lq)
> > +{
> > + int i, k;
>
> > +
>
> > + /* Rotate lq_prev */
> > + for(i=0; i<10; i++)
> > + {
> > + for(k=MA_NP-1; k>0; k--)
> > + ctx->lq_prev[k][i] = ctx->lq_prev[k-1][i];
> > +
> > + ctx->lq_prev[0][i] = lq[i];
> > + }
>
> reorder and use memcpy()
done
> also pass the arrays not the context as parameters, this is cleaner IMHO
> and makes it more clear what is changed
when i pass lq_prev parameter as "int16**" program segfaults.
Perhaps int[][] array can not be passed as int**.
> > + if(!i)
> > + {
> > + // Decoding of the adaptive-codebook vector delay for first subframe (4.1.3)
> > + if(ctx->bad_pitch || ctx->data_error)
> > + pitch_delay_3x = 3 * ctx->pitch_delay_int_prev + 1;
> > + else
> > + {
>
> > + if(parm->ac_index[i] >= 197)
> > + pitch_delay_3x = 3 * parm->ac_index[i] - 335;
> > + else
> > + pitch_delay_3x = parm->ac_index[i] + 59;
>
> not sure but somehow the follow looks nicer to me
>
> pitch_delay_3x = parm->ac_index[i] + 59;
> if(pitch_delay_3x > 255)
> pitch_delay_3x = 3 * pitch_delay_3x - 512;
Cool!
> [...]
> > + g729_update_gain_erasure(ctx->pred_energ_q);
> > + }
> > + else
> > + {
> > + // Decoding of the fixed codebook gain (4.1.5 and 3.9.1)
> > + ctx->gain_pitch = cb_GA[parm->ga_cb_index[i]][0] + cb_GB[parm->gb_cb_index[i]][0];
> > +
> > + ctx->gain_code = g729_get_gain_code(parm->ga_cb_index[i],
> > + parm->gb_cb_index[i],
>
> besides the very ugly misalignment of the parameters
> it woudl be MUCH cleaer to look up all values outside, not look up the first 2
> outside and the next 2 inside the function
1. I used 8-spaces shift, what would be better ?
2. What "first 2" and "next 2" values do you mean ?
After receiving answer about things above and fixing them I'll send
updated patch.
I'm afraid it will require reviewing from top to bottom due to number
of changes.
--
Regards,
Vladimir Voroshilov mailto:voroshil at gmail.com
JID: voroshil at gmail.com, voroshil at jabber.ru
ICQ: 95587719
More information about the ffmpeg-devel
mailing list