[FFmpeg-devel] [PATCH] QCELP decoder
Kenan Gillet
kenan.gillet
Fri Nov 14 21:17:50 CET 2008
On Nov 14, 2008, at 2:14 AM, Michael Niedermayer wrote:
> On Thu, Nov 13, 2008 at 11:46:40AM -0800, Kenan Gillet wrote:
>> Thanks for your reviews,
>>
>> here is round 10 of the QCELP decoder patch:
>> - it simplifies and optimizes qcelp_lspf2lpc and lps2poly
>> - it removes bandwith_expansion_coeff from QCELPContext, and
>> replaces it
>> by a systematic recalculation as the benchmarks show it is faster
>> - simplifies apply_gain_control,
>> - adds FIXME comment an av_log_missing_features in apply_gain_control
>> if the vector to control gain of is a zero vector.
>> - various cosmetics
>>
>> Kenan
>>
[...]
>
>> +
>> +/**
>> + * Decodes the 10 quantized LSP frequencies from the LSPV/LSP
>> + * transmission codes of any framerate and checks for badly
>> received packets.
>> + *
>> + * @param q the context
>> + * @param lspf line spectral pair frequencies
>> + *
>> + * @return 0 on success, -1 if the packet is badly received
>> + *
>> + * TIA/EIA/IS-733 2.4.3.2.6.2-2, 2.4.8.7.3
>> + */
>> +static int decode_lspf(QCELPContext *q,
>> + float *lspf) {
>> + int i;
>> + float tmp_lspf;
>> +
>> + if (q->framerate == RATE_OCTAVE ||
>> + q->framerate == I_F_Q) {
>> + float smooth;
>
>> + const float *predictors = (q->prev_framerate !=
>> RATE_OCTAVE ||
>> + q->prev_framerate != I_F_Q ? q-
>> >prev_lspf
>> + : q-
>> >predictor_lspf);
>
> 3rd try.
> This is a constant, if you disagree tell us for which value of
> prev_framerate
> q->predictor_lspf is selected
It is, and I should crawl under a rock for that :(
Of course, it should be a &&
fixed
>
> [...]
>
>> +/**
>> + * Converts codebook transmission codes to GAIN and INDEX.
>> + *
>> + * @param q the context
>> + * @param gain array holding the decoded gain
>> + *
>> + * @return 0 on success, -1 if the gain is out of range for
>> RATE_QUARTER
>> + *
>> + * TIA/EIA/IS-733 2.4.6.2
>> + */
>> +static int decode_gain_and_index(QCELPContext *q,
>> + float *gain) {
>> + int i, subframes_count, g1[16];
>> + float ga[16], gain_memory, smooth_coef;
>> +
>> + if (q->framerate >= RATE_QUARTER) {
>> + subframes_count = q->framerate == RATE_FULL ? 16
>> + : q->framerate
>> == RATE_HALF ? 4
>> +
>> : 5
>> ;
>> + for (i = 0; i < subframes_count; i++) {
>> + g1[i] = 4 * q->cbgain[i];
>> + if (q->framerate == RATE_FULL && !((i+1) & 3)) {
>
>> + g1[i] += av_clip((g1[i-1] + g1[i-2] + g1[i-3]) /
>> 3, 6, 38) - 6;
>
> g1[i] += av_clip((g1[i-1] + g1[i-2] + g1[i-3]) / 3 - 6, 0, 32);
done
>> + if (g1[i] > 60)
>> + g1[i] = 60;
>
> i dont see how this can be true
removed,
I did not check that the cbgain for the RATE_FULL every 4th subframe
was only coded on 7 bits instead of 8.
so q->cbgain[i] is coded up to 3 bits so it can have a value up to 28,
which makes it 60 with the av_clip.
>
>
>> + } else if (q->framerate == RATE_QUARTER) {
>
>> + if (i > 0 && FFABS(g1[i] - g1[i-1]) > 40)
>> + return -1;
>> + if (i >= 2 && FFABS(g1[i] - 2 * g1[i-1] + g1[i-2])
>> > 48)
>> + return -1;
>
> iam not sure, but maybe this check should be seperate and closer
> to where the bits are read
done
>
>> + }
>> +
>> + gain[i] = qcelp_g12ga[g1[i]];
>> +
>> + if (q->cbsign[i]) {
>> + gain[i] = -gain[i];
>> + q->cindex[i] = (q->cindex[i]-89) & 127;
>> + }
>> + }
>> +
>> + q->prev_g1[0] = g1[i-2];
>> + q->prev_g1[1] = g1[i-1];
>
>> + q->last_codebook_gain = gain[i-1];
>
> q->last_codebook_gain = qcelp_g12ga[g1[i-1]];
> would avoid the FFABS() later
done
>
>> +
>> + if (q->framerate == RATE_QUARTER) {
>> + // Provide smoothing of the unvoiced excitation energy.
>> + gain[7] = gain[4];
>> + gain[6] = 0.4*gain[3] + 0.6*gain[4];
>> + gain[5] = gain[3];
>> + gain[4] = 0.8*gain[2] + 0.2*gain[3];
>> + gain[3] = 0.2*gain[1] + 0.8*gain[2];
>> + gain[2] = gain[1];
>> + gain[1] = 0.6*gain[0] + 0.4*gain[1];
>> + }
>> + } else {
>> + if (q->framerate == RATE_OCTAVE) {
>
>> + g1[0] = -4 + 2 * q->cbgain[0]
>> + + av_clip((q->prev_g1[0] + q->prev_g1[1]) / 2 -
>> 1, 4, 58);
>
> g1[0] = 2 * q->cbgain[0] + av_clip((q->prev_g1[0] + q->prev_g1[1]) /
> 2 - 5, 0, 54);
done
>> + smooth_coef = 0.125;
>> + i = 7;
>> + } else {
>> + assert(q->framerate == I_F_Q);
>> +
>> + g1[0] = q->prev_g1[1];
>> + switch (q->erasure_count) {
>> + case 1 : break;
>> + case 2 : g1[0] -= 1; break;
>> + case 3 : g1[0] -= 2; break;
>> + default: g1[0] -= 6;
>> + }
>> + if (g1[0] < 0)
>> + g1[0] = 0;
>> + smooth_coef = 0.25;
>> + i = 3;
>> + }
>> + ga[0] = qcelp_g12ga[g1[0]];
>
>> + gain_memory = FFABS(q->last_codebook_gain);
>
> this almost is qcelp_g12ga[q->prev_g1[1]]
> i assume the "almost" is not just a bug and it was intended to be
> always?
I double check and it is not, RATE_OCTAVE and I_F_Q interpolates
the gain from the current and previous frame, thus the almost.
>
>> +
>> + q->last_codebook_gain =
>> + gain[i] =
>> + ga[0] = 0.5 * (gain_memory + ga[0]);
>> +
>> + smooth_coef *= (ga[0] - gain_memory);
>
> storing in ga[0] is redundant
done
>
>> + // This interpolation is done to produce smoother
>> background noise.
>> + for (; i > 0; i--)
>> + gain[i-1] = gain_memory + smooth_coef * i;
>> +
>> + q->prev_g1[0] = q->prev_g1[1];
>> + q->prev_g1[1] = g1[0];
>> + }
>> + return 0;
>> +}
>> +
>
>
[...]
>
>> +
>> +/**
>> * Apply filter in pitch-subframe steps.
>> *
>> * @param memory buffer for the previous state of the filter
>> @@ -104,6 +425,70 @@
>> }
>>
>
>> /**
>> + * Apply pitch synthesis filter and pitch prefilter to the scaled
>> codebook vector.
>> + * TIA/EIA/IS-733 2.4.5.2
>> + *
>> + * @param q the context
>> + * @param cdn_vector the scaled codebook vector
>> + *
>> + * @return 0 on success, -1 if the lag is out of range
>> + */
>> +static int apply_pitch_filters(QCELPContext *q,
>> + float *cdn_vector) {
>> + int i;
>> + float gain[4];
>> + const float *v_synthesis_filtered, *v_pre_filtered;
>> +
>> + if (q->framerate >= RATE_HALF ||
>> + (q->framerate == I_F_Q && (q->prev_framerate >=
>> RATE_HALF))) {
>> +
>> + if (q->framerate >= RATE_HALF) {
>> +
>> + // Compute gain & lag for the whole frame.
>> + for (i = 0; i < 4; i++) {
>> + gain[i] = q->plag[i] ? (q->pgain[i] + 1) / 4.0 :
>> 0.0;
>> +
>> + q->plag[i] += 16;
>> +
>
>> + if (q->pfrac[i] && q->plag[i] >= 140)
>> + return -1;
>
> iam thinking that such bitstream checks should be closer to the
> bitstream
> decoding.
> this also would allow this function to have no return value
>
moved just after the unpacking of the frame.
[...]
>
>
>> @@ -152,11 +537,140 @@
>> return -1;
>> }
>>
>> +/*
>> + * Determine the framerate from the frame size and/or the first
>> byte of the frame.
>> + *
>> + * @param avctx the AV codec context
>> + * @param buf_size length of the buffer
>> + * @param buf the bufffer
>> + *
>> + * @return the framerate on success, RATE_UNKNOWN otherwise.
>> + */
>> +static int determine_framerate(AVCodecContext *avctx,
>> + const int buf_size,
>> + uint8_t **buf) {
>> + qcelp_packet_rate framerate;
>> +
>> + if ((framerate = buf_size2framerate(buf_size)) >= 0) {
>> + if (framerate != **buf) {
>
> iam not sure but didnt you at some point reorder the enum?
> if so how can this code be correct before and afterwards?
I reorder the enum on the 09/07/2008, way before submitting my first
patch to
RATE_FULL = 0,
RATE_HALF = 1,
RATE_QUARTER= 2,
RATE_OCTAVE = 3,
I_F_Q, /*!< insufficient frame quality */
BLANK,
RATE_UNKNOWN
to
SILENCE = 0,
RATE_OCTAVE,
RATE_QUARTER,
RATE_HALF,
RATE_FULL,
I_F_Q, /*!< insufficient frame quality */
RATE_UNKNOWN
in order to reflect the rate byte in the QCELP frame.
and I changed on the 10/27/2008 to
RATE_UNKNOWN = -2,
I_F_Q, /*!< insufficient frame quality */
SILENCE,
RATE_OCTAVE,
RATE_QUARTER,
RATE_HALF,
RATE_FULL
when you asked me to change the
switch (framerate)
case RATE_FULL:
case RATE_QUARTER:
case RATE_OCTAVE:
}
to (framerate >= RATE_QUARTER)
After sending the patch round 10, I also added a check to make sure
the buffer
contains enough data for the the frame to be decoded without reading
garbage.
Futhermore, I am dropping RATE_UNKNOWN and replace it by I_F_Q
since the specification says at TIA/EIA/IS-733 2.4.8.7.1:
"If the received packet type cannot be satisfactorily determined, the
multiplex sublayer
informs the receiving speech codec of an erasure (see 2.3.2.2)."
attached the round 11:
qcelp-round11-doc-glue.patch.txt and qcelp-round11-lsp.patch.txt are
identical to previous round.
More information about the ffmpeg-devel
mailing list