[FFmpeg-devel] [PATCH] QCELP decoder

Fri Nov 14 21:17:50 CET 2008

On Nov 14, 2008, at 2:14 AM, Michael Niedermayer wrote:

> On Thu, Nov 13, 2008 at 11:46:40AM -0800, Kenan Gillet wrote:
>> Thanks for your reviews,
>>
>> here is round 10 of the QCELP decoder patch:
>> - it simplifies and optimizes qcelp_lspf2lpc and lps2poly
>> - it removes bandwith_expansion_coeff from QCELPContext, and  
>> replaces it
>>  by a systematic recalculation as the benchmarks show it is faster
>> - simplifies apply_gain_control,
>> - adds FIXME comment an av_log_missing_features in apply_gain_control
>>  if the vector to control gain of is a zero vector.
>> - various cosmetics
>>
>> Kenan
>>

[...]

>
>> +
>> +/**
>> + * Decodes the 10 quantized LSP frequencies from the LSPV/LSP
>> + * transmission codes of any framerate and checks for badly  
>> received packets.
>> + *
>> + * @param q the context
>> + * @param lspf line spectral pair frequencies
>> + *
>> + * @return 0 on success, -1 if the packet is badly received
>> + *
>> + * TIA/EIA/IS-733 2.4.3.2.6.2-2, 2.4.8.7.3
>> + */
>> +static int decode_lspf(QCELPContext *q,
>> +                       float *lspf) {
>> +    int i;
>> +    float tmp_lspf;
>> +
>> +    if (q->framerate == RATE_OCTAVE ||
>> +        q->framerate == I_F_Q) {
>> +        float smooth;
>
>> +        const float *predictors = (q->prev_framerate !=  
>> RATE_OCTAVE ||
>> +                                   q->prev_framerate != I_F_Q ? q- 
>> >prev_lspf
>> +                                                              : q- 
>> >predictor_lspf);
>
> 3rd try.
> This is a constant, if you disagree tell us for which value of  
> prev_framerate
> q->predictor_lspf is selected

It is, and I should crawl under a rock for that :(
Of course, it should be a &&
fixed

>
> [...]
>
>> +/**
>> + * Converts codebook transmission codes to GAIN and INDEX.
>> + *
>> + * @param q the context
>> + * @param gain array holding the decoded gain
>> + *
>> + * @return 0 on success, -1 if the gain is out of range for  
>> RATE_QUARTER
>> + *
>> + * TIA/EIA/IS-733 2.4.6.2
>> + */
>> +static int decode_gain_and_index(QCELPContext  *q,
>> +                                 float *gain) {
>> +    int   i, subframes_count, g1[16];
>> +    float ga[16], gain_memory, smooth_coef;
>> +
>> +    if (q->framerate >= RATE_QUARTER) {
>> +        subframes_count = q->framerate == RATE_FULL ? 16
>> +                                                    : q->framerate  
>> == RATE_HALF ? 4
>> + 
>>                                                                                 : 5 
>> ;
>> +        for (i = 0; i < subframes_count; i++) {
>> +            g1[i] = 4 * q->cbgain[i];
>> +            if (q->framerate == RATE_FULL && !((i+1) & 3)) {
>
>> +                g1[i] += av_clip((g1[i-1] + g1[i-2] + g1[i-3]) /  
>> 3, 6, 38) - 6;
>
> g1[i] += av_clip((g1[i-1] + g1[i-2] + g1[i-3]) / 3 - 6, 0, 32);

done

>> +                if (g1[i] > 60)
>> +                    g1[i] = 60;
>
> i dont see how this can be true

removed,
I did not check that the cbgain for the RATE_FULL every 4th subframe
was only coded on 7 bits instead of 8.
so q->cbgain[i] is coded up to 3 bits so it can have a value up to 28,
  which makes it 60 with the av_clip.

>
>
>> +            } else if (q->framerate == RATE_QUARTER) {
>
>> +                if (i > 0  && FFABS(g1[i] -     g1[i-1]) > 40)
>> +                    return -1;
>> +                if (i >= 2 && FFABS(g1[i] - 2 * g1[i-1] + g1[i-2])  
>> > 48)
>> +                    return -1;
>
> iam not sure, but maybe this check should be seperate and closer
> to where the bits are read

done

>
>> +            }
>> +
>> +            gain[i] = qcelp_g12ga[g1[i]];
>> +
>> +            if (q->cbsign[i]) {
>> +                gain[i] = -gain[i];
>> +                q->cindex[i] = (q->cindex[i]-89) & 127;
>> +            }
>> +        }
>> +
>> +        q->prev_g1[0] = g1[i-2];
>> +        q->prev_g1[1] = g1[i-1];
>
>> +        q->last_codebook_gain = gain[i-1];
>
> q->last_codebook_gain = qcelp_g12ga[g1[i-1]];
> would avoid the FFABS() later

done

>
>> +
>> +        if (q->framerate == RATE_QUARTER) {
>> +            // Provide smoothing of the unvoiced excitation energy.
>> +            gain[7] =     gain[4];
>> +            gain[6] = 0.4*gain[3] + 0.6*gain[4];
>> +            gain[5] =     gain[3];
>> +            gain[4] = 0.8*gain[2] + 0.2*gain[3];
>> +            gain[3] = 0.2*gain[1] + 0.8*gain[2];
>> +            gain[2] =     gain[1];
>> +            gain[1] = 0.6*gain[0] + 0.4*gain[1];
>> +        }
>> +    } else {
>> +        if (q->framerate == RATE_OCTAVE) {
>
>> +            g1[0] = -4 + 2 * q->cbgain[0]
>> +                  + av_clip((q->prev_g1[0] + q->prev_g1[1]) / 2 -  
>> 1, 4, 58);
>
> g1[0] = 2 * q->cbgain[0] + av_clip((q->prev_g1[0] + q->prev_g1[1]) /  
> 2 - 5, 0, 54);

done

>> +            smooth_coef = 0.125;
>> +            i = 7;
>> +        } else {
>> +            assert(q->framerate == I_F_Q);
>> +
>> +            g1[0] = q->prev_g1[1];
>> +            switch (q->erasure_count) {
>> +            case 1 : break;
>> +            case 2 : g1[0] -= 1; break;
>> +            case 3 : g1[0] -= 2; break;
>> +            default: g1[0] -= 6;
>> +            }
>> +            if (g1[0] < 0)
>> +                g1[0] = 0;
>> +            smooth_coef = 0.25;
>> +            i = 3;
>> +        }
>> +        ga[0] = qcelp_g12ga[g1[0]];
>
>> +        gain_memory = FFABS(q->last_codebook_gain);
>
> this almost is qcelp_g12ga[q->prev_g1[1]]
> i assume the "almost" is not just a bug and it was intended to be  
> always?

I double check and it is not, RATE_OCTAVE and I_F_Q interpolates
  the gain from the current and previous frame, thus the almost.

>
>> +
>> +        q->last_codebook_gain =
>> +                      gain[i] =
>> +                        ga[0] = 0.5 * (gain_memory + ga[0]);
>> +
>> +        smooth_coef *= (ga[0] - gain_memory);
>
> storing in ga[0] is redundant

done

>
>> +        // This interpolation is done to produce smoother  
>> background noise.
>> +        for (; i > 0; i--)
>> +            gain[i-1] = gain_memory + smooth_coef * i;
>> +
>> +        q->prev_g1[0] = q->prev_g1[1];
>> +        q->prev_g1[1] = g1[0];
>> +    }
>> +    return 0;
>> +}
>> +
>
>

[...]

>
>> +
>> +/**
>>  * Apply filter in pitch-subframe steps.
>>  *
>>  * @param memory buffer for the previous state of the filter
>> @@ -104,6 +425,70 @@
>> }
>>
>
>> /**
>> + * Apply pitch synthesis filter and pitch prefilter to the scaled  
>> codebook vector.
>> + * TIA/EIA/IS-733 2.4.5.2
>> + *
>> + * @param q the context
>> + * @param cdn_vector the scaled codebook vector
>> + *
>> + * @return 0 on success, -1 if the lag is out of range
>> + */
>> +static int apply_pitch_filters(QCELPContext *q,
>> +                               float *cdn_vector) {
>> +    int         i;
>> +    float       gain[4];
>> +    const float *v_synthesis_filtered, *v_pre_filtered;
>> +
>> +    if (q->framerate >= RATE_HALF ||
>> +       (q->framerate == I_F_Q && (q->prev_framerate >=  
>> RATE_HALF))) {
>> +
>> +        if (q->framerate >= RATE_HALF) {
>> +
>> +            // Compute gain & lag for the whole frame.
>> +            for (i = 0; i < 4; i++) {
>> +                gain[i] = q->plag[i] ? (q->pgain[i] + 1) / 4.0 :  
>> 0.0;
>> +
>> +                q->plag[i] += 16;
>> +
>
>> +                if (q->pfrac[i] && q->plag[i] >= 140)
>> +                    return -1;
>
> iam thinking that such bitstream checks should be closer to the  
> bitstream
> decoding.
> this also would allow this function to have no return value
>

moved just after the unpacking of the frame.

[...]

>
>
>> @@ -152,11 +537,140 @@
>>     return -1;
>> }
>>
>> +/*
>> + * Determine the framerate from the frame size and/or the first  
>> byte of the frame.
>> + *
>> + * @param avctx the AV codec context
>> + * @param buf_size length of the buffer
>> + * @param buf the bufffer
>> + *
>> + * @return the framerate on success, RATE_UNKNOWN otherwise.
>> + */
>> +static int determine_framerate(AVCodecContext *avctx,
>> +                               const int buf_size,
>> +                               uint8_t **buf) {
>> +    qcelp_packet_rate framerate;
>> +
>> +    if ((framerate = buf_size2framerate(buf_size)) >= 0) {
>> +        if (framerate != **buf) {
>
> iam not sure but didnt you at some point reorder the enum?
> if so how can this code be correct before and afterwards?

I reorder the enum on the 09/07/2008, way before submitting my first  
patch to
     RATE_FULL   = 0,
     RATE_HALF   = 1,
     RATE_QUARTER= 2,
     RATE_OCTAVE = 3,
     I_F_Q,          /*!< insufficient frame quality */
     BLANK,
     RATE_UNKNOWN
to
     SILENCE = 0,
     RATE_OCTAVE,
     RATE_QUARTER,
     RATE_HALF,
     RATE_FULL,
     I_F_Q,          /*!< insufficient frame quality */
     RATE_UNKNOWN
in order to reflect the rate byte in the QCELP frame.

and I changed on the 10/27/2008 to
     RATE_UNKNOWN = -2,
     I_F_Q,             /*!< insufficient frame quality */
     SILENCE,
     RATE_OCTAVE,
     RATE_QUARTER,
     RATE_HALF,
     RATE_FULL
when you asked me to change the
switch (framerate)
   case RATE_FULL:
   case RATE_QUARTER:
   case RATE_OCTAVE:
}
to (framerate >= RATE_QUARTER)

After sending the patch round 10, I also added a check to make sure  
the buffer
contains enough data for the the frame to be decoded without reading  
garbage.

Futhermore, I am dropping RATE_UNKNOWN and replace it by I_F_Q
  since the specification says at TIA/EIA/IS-733 2.4.8.7.1:
"If the received packet type cannot be satisfactorily determined, the  
multiplex sublayer
informs the receiving speech codec of an erasure (see 2.3.2.2)."

attached the round 11:

qcelp-round11-doc-glue.patch.txt and qcelp-round11-lsp.patch.txt are
identical to previous round.