[FFmpeg-devel] [PATCH] g722 decoder, no licensing fame
Michael Niedermayer
michaelni
Fri Apr 10 05:22:48 CEST 2009
On Thu, Apr 09, 2009 at 08:14:58AM -0700, Kenan Gillet wrote:
> On Thu, Apr 9, 2009 at 12:12 AM, Diego Biurrun <diego at biurrun.de> wrote:
> > On Wed, Apr 08, 2009 at 09:23:51PM -0700, Kenan Gillet wrote:
> >>
> >> here is a version 3 of the patch:
> >> - it includes the Diego's suggested changes
> >
> > It does not.
>
> darn, i double checked the changes, and just ended up attaching the
> wrong file :(
>
> anyway correct patch attached,
[...]
> + * @note For the 56000bps and 48000bps bitrates, the respectively 7 and 6 bits
> + * codeword might be packed, so unpacking might be needed either
> + * internally or as a separate parser.
do non packed cases exist anywhere in the wild?
do packed cases exist anywhere in the wild?
it seems to me that filling 30% of the bits with 0 is not particlarely
likely in a real world environment
> + */
> +
> +#include <stdint.h>
> +#include "avcodec.h"
> +#include "mathops.h"
> +
> +typedef struct {
> + int bits_per_sample; ///< 6 for 48000kbps, 7 for 56000kbps, or 8 for 64000kbps
> +
> + int16_t prev_samples[24]; ///< memory of past 24 received (decoded) samples
> +
> + /**
> + * The band[0] and band[1] correspond respectively to the lower band and higher band.
> + */
> + struct G722Band {
> + int16_t s_predictor; ///< predictor output value
ok that makes sense
> + int32_t s_zero; ///< zero section output signal
what is that?
> + int8_t part_reconst_mem[2]; ///< partially reconstructed signal memory
and that?
> + int16_t prev_qtzd_reconst; ///< previous quantized reconstructed signal
this is not true, this uses a different dequantization table in some cases
i even wonder if this is a bug
> + int16_t pole_mem[2]; ///< second-order pole section coefficient buffer
> + int32_t diff_mem[6]; ///< quantizer difference signal memory
> + int16_t zero_mem[6]; ///< Seventh-order zero section coefficient buffer
> + int16_t log_factor; ///< delayed logarithmic quantizer factor
log what? log_e log_2 log_10 ?
> + int16_t scale_factor; ///< delayed quantizer scale factor
> + } band[2];
> +} G722Context;
> +
> +
> +static const int8_t sign_lookup[2] = { -1, 1 };
> +
> +static const int16_t ilb[32] = {
> + 2048, 2093, 2139, 2186, 2233, 2282, 2332, 2383,
> + 2435, 2489, 2543, 2599, 2656, 2714, 2774, 2834,
> + 2896, 2960, 3025, 3091, 3158, 3228, 3298, 3371,
> + 3444, 3520, 3597, 3676, 3756, 3838, 3922, 4008
> +};
> +static const int16_t wh[2] = { 798, -214 };
2 letters is better than 1 ... well sometimes its not much better though
> +static const int16_t qm2[4] = { -7408, -1616, 7408, 1616 };
> +/**
> + * qm3[index] == wl[rl42[index]]
> + */
> +static const int16_t qm3[16] = {
> + -60, 3042, 1198, 538, 334, 172, 58, -30,
> + 3042, 1198, 538, 334, 172, 58, -30, -60
> +};
> +static const int16_t qm4[16] = {
> + 0, -20456, -12896, -8968, -6288, -4240, -2584, -1200,
> + 20456, 12896, 8968, 6288, 4240, 2584, 1200, 0
> +};
what is qm?
i know these are dequantization tables, and at that poorly designed ones
(yeah the duplicate entries)
but i cant relate "qm" to "dequantization"
> +
> +/**
> + * quadrature mirror filter (QMF) coefficients
> + *
> + * ITU-T G.722 Table 11
> + */
> +static const int16_t qmf_coeffs[12] = {
> + 3, -11, 12, 32, -210, 951, 3876, -805, 362, -156, 53, -11,
> +};
> +
> +
> +/**
> + * adaptive predictor
> + *
> + * @note On x86 using the MULL macro in a loop is slower than not using the macro.
> + */
> +static void do_adaptive_prediction(struct G722Band *band, const int cur_diff)
missing doxy for cur_diff
[...]
> +static int inline scale(const int log_factor, int shift)
> +{
> + const int wd1 = ilb[(log_factor >> 6) & 31];
> + shift -= log_factor >> 11;
> + return (shift < 0 ? wd1 << -shift : wd1 >> shift) << 2;
> +}
i belive we already have some integer exp2() also this one lacks
documentation and a name that is related to what it does if there is
a reason why common code cannot be used
[...]
> +static void apply_qmf(int16_t *prev_samples, int *xout1, int *xout2)
> +{
> + int i;
> +
> + *xout1 = 0;
> + *xout2 = 0;
> + for (i = 0; i < 12; i++) {
> + MAC16(*xout2, prev_samples[2*i ], qmf_coeffs[i ]);
> + MAC16(*xout1, prev_samples[2*i+1], qmf_coeffs[11-i]);
> + }
> + memmove(prev_samples, prev_samples + 2, 22*sizeof(prev_samples[0]));
please find a solution that does not need to move the array by 2 samples
after each 2 samples.
[...]
> +static int g722_decode_frame(AVCodecContext *avctx, void *data, int *data_size, AVPacket *avpkt)
> +{
> + G722Context *c = avctx->priv_data;
> + int16_t *out_buf = data;
> + const uint8_t *buf = avpkt->data;
> + int j, out_len = 0;
> + const int shift = 8 - c->bits_per_sample;
> + const int16_t *quantizer_table = qms[shift];
> +
> + for (j = 0; j < avpkt->size; j++) {
> + const int ilow = buf[j] & (0x3F >> shift);
> + const int rlow = av_clip(MULL(c->band[0].scale_factor, quantizer_table[ilow], 15) +
> + c->band[0].s_predictor, -16384, 16383);
> +
> + update_low_predictor(&c->band[0], ilow >> (2 - shift));
> +
> + if (avctx->sample_rate == 16000) {
> + const int ihigh = (buf[j] >> (6 - shift)) & 0x03;
all that looks a litte obfuscated, get_bits() seems like the proper choice
[...]
> +static int g722_encode_frame(AVCodecContext *avctx,
> + uint8_t *dst, int buf_size, void *data)
> +{
> + G722Context *c = avctx->priv_data;
> + const int16_t *samples = data;
> +
> + int diff, limit;
> + int i, j;
> +
> + for (j = 0; j < buf_size; ) {
> + int xlow, rlow;
> + if (avctx->sample_rate == 16000) {
> + int xout1, xout2, xhigh, diff, pred, index;
> +
> + c->prev_samples[22] = samples[j++];
> + c->prev_samples[23] = samples[j++];
> + apply_qmf(c->prev_samples, &xout1, &xout2);
> + xlow = (xout1 + xout2) >> 14;
> + xhigh = (xout1 - xout2) >> 14;
> +
> + diff = av_clip_int16(xhigh - c->band[1].s_predictor);
> + pred = MULL(564, c->band[1].scale_factor, 12);
> + index = diff >= 0 ? (diff < pred) + 2
> + : diff >= -pred;
> +
> + update_high_predictor(&c->band[1], MULL(c->band[1].scale_factor, qm2[index], 15), index);
> +
> + *dst = (index << 6);
> + } else
> + xlow = samples[j++];
> +
> + diff = av_clip_int16(xlow - c->band[0].s_predictor);
> + limit = diff >= 0 ? diff : -(diff + 1);
> + for (i = 0; i < 29 && limit >= (q6[i]*c->band[0].scale_factor) >> 12; i++)
> + ;
> + *dst |=
> + rlow = (diff < 0 ? (i < 2 ? 63 : 33) : 61) - i;
i have good news and bad news
the good are that quality might be significatly improvable by using
trellis/viterbi, the bad news is that the encoder is rejected unless
you either implement that or proof that it cant be done
(and i dont mean full viterbi but keeping more than 1 possible
previous bitstream that is not just 1 previous and the best current
quantized value but several best past ones X several current quantized
values ...)
[...]
--
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
Concerning the gods, I have no means of knowing whether they exist or not
or of what sort they may be, because of the obscurity of the subject, and
the brevity of human life -- Protagoras
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: Digital signature
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20090410/97714d09/attachment.pgp>
More information about the ffmpeg-devel
mailing list