[FFmpeg-devel] [PATCH] TwinVQ decoder

Fri Jul 31 16:41:52 CEST 2009

On Wed, Jul 22, 2009 at 09:16:50AM +0200, Vitor Sessak wrote:
> New version attached. Changes:
>
> - Use DSP functions
> - Do not alloc big buffers on the stack
> - Some random optimizations
> - Diego's cosmetics suggestions

I have not looked at previous reviews...

> /**
>  * Parameters and tables that are different for each frame type
>  */
> typedef struct {
> } FrameMode;

Used only once, so IMO use only struct FrameMode and don't add a typedef.

> typedef struct {
>     const FrameMode fmode[3]; ///< frame type-dependant parameters

I think that const makes little sense, for the static tables it
is const because the overall  struct is const, and like this
it would make it impossible to initialize fmode.

> } ModeTab;

I'd avoid the typedef here, too.

> static const ModeTab mode_08_08 = {
> {{ 8, bark_tab_s08_64,  10, tab.fcb08s  , 1, 5, tab.cb0808s0, tab.cb0808s1, 18},
>  { 2, bark_tab_m08_256, 20, tab.fcb08m  , 2, 5, tab.cb0808m0, tab.cb0808m1, 16},
>  { 1, bark_tab_l08_512, 30, tab.fcb08l  , 3, 6, tab.cb0808l0, tab.cb0808l1, 17}
>  },
>  512 , 12, tab.lsp08,   1, 5, 3, 3, tab.shape08  , 8, 28, 20, 6, 40
> };

IMO try to use standard indentation.

> typedef struct TwinContext {
>     AVCodecContext *avctx;
>     DSPContext      dsp;
>     const ModeTab *mtab;
>     float lsp_hist[2][20];           ///< LSP coefficients of the last frame
>     int16_t permut[4][4096];
>     float bark_hist[3][2][40];       ///< BSE coefficients of last frame
>     float *prev;                     ///< spectrum of last frame
>     MDCTContext mdct_ctx[3];
>     GetBitContext gb;
>     uint8_t length[4][2];            ///< main codebook stride
>     uint8_t length_change[4];
>     uint8_t bits_main_spec[2][4][2]; ///< bits for the main codebook
>     int bits_main_spec_change[4];
>     int n_div[4];

Seems a bit chaotic to me with the MDCT and getBit contexts right
in the middle there...
Also I'd think about using an on-stack GetBitContext instead of having
it in the global context, iit just feels wrong to me like this.

>     float *out1;                     ///< non-interleaved output

out1 IMO is a really bad name.

> static void memset_float(float *buf, float val, int size)
> {
>     while (size--)
>         *buf++ = val;
> }

Things to check:
Is this performance-relevant?
What are the usual values for size?
Is size possibly always or often a multiple of some power of two?
Should it be inline?
Should the loop be (partially) unrolled?

> static void add_vec(int cont, const float *buf1, const float *buf2, float *buf3)
> {
>     while (cont--)
>         *buf3++ = *buf1++ + *buf2++;
> }

The same applies here, but you should also check if you can/should use restrict on
the pointer arguments.

>  * @param lsp a vector of the cosinus of the LSP values
>  * @param cos_val cos(PI*i/N) where i is the index of the LPC amplitude

The order argument is not documented, esp. its valid ranges/values.

> /**
>  * Evaluates the LPC amplitude spectrum envelope from the line spectrum pairs.
>  */
> static void eval_lpcenv(const float *cos_vals, float *lpc, TwinContext *tctx)

Putting the context at the end IMO is strange.

>     int i;
>     const ModeTab *mtab = tctx->mtab;
>     int size_s = mtab->size / mtab->fmode[FT_SHORT].sub;
>     int mag_95 = 2 * mtab->fmode[FT_SHORT].sub;
>     float *cos_TT = ff_cos_tabs[av_log2(mtab->size)-1];
> 
>     for (i=0; i < (size_s/2); i++) {

Useless ()

>         lpc[i]          = eval_lpc_spectrum(cos_vals,  cos_TT[mag_95*(2*i+1)],
>                                             mtab->n_lsp);
>         lpc[size_s-i-1] = eval_lpc_spectrum(cos_vals, -cos_TT[mag_95*(2*i+1)],
>                                             mtab->n_lsp);

The expression mag_95*(2*i+1) should probably a extra variable thhat gets
updated during the loop instead of recalculated each time.

> static void interpolate(float *out, float v1, float v2, int size)
> {
>     int j;
>     float step = (v1-v2)/size;
> 
>     out[0] = v2 + step;
>     for (j=1; j < size - 1; j++)
>         out[j] = out[j-1] + step;

Why j and not i?
Is out[size-1] intentionally not set?
Also
for (i = 0; i < size - 1; i++) {
  v2 += step;
  out[i] = v2;
}

Seems simpler to me.

> #define GET_COS(sub, idx, forward, cos_tab, size)  \
>     ((forward) ?                                   \
>      ( cos_tab[           (sub)*(idx)]) :          \
>      (-cos_tab[2*(size) - (sub)*(idx)]))

IMO unless it gives you real speed issue make that a static inline function,
in the worst case using av_always_inline.

>  * @param step the size of a block "siiiibiiii"
>  * @param in the cosinus of the LSP data
>  */
> static inline void eval_lpcenv_or_interp(float *out, const float *in,
>                                          int size, int step, int sub,
>                                          const ModeTab *mtab, int forward)

Several undocumented arguments.

>     // Fill the 'iiiibiiii'
>     for (i=step; i < size; i += step) {
>         if ((i == size - step) ||
>             (out[i + step] + out[i - step] > 1.95*out[i]) ||
>             (out[i - step]               <= out[i + step])) {

I think this assumes that size is a multiple of step and otherwise
may overread. This kind of thing absolutely _must_ be documented,
though it would be even better to avoid (e.g. using i >= size - step).
Also I think this condition is not particularly readable, it should
at least have a comment explaining it.
Performance-wise, checking for "i == size - step" in each iteration
when it is sure it will only be true the last time is not exactly
optimal either, esp. since I think it doesn't save you much complexity
code-wise anyway...

> static void dequant(float *out, enum FrameType ftype, const int16_t *cb0,
>                     const int16_t *cb1, int cb_len, TwinContext *tctx)

This and others I think lack doxygen comments.

>         int length = tctx->length[ftype][i >= tctx->length_change[ftype]];
> 
>         int bits = tctx->bits_main_spec[0][ftype]
>                                       [i >= tctx->bits_main_spec_change[ftype]];

IMHO you are packing a bit too much in one line.
Also, using a well-named variable for the result of the "i >= ..." result
will also give the reader a better idea of what you are doing.

>         if (bits == 7) {
>             if (get_bits1(&tctx->gb))
>                 sign0 = -1;
>             tmp0 = get_bits(&tctx->gb, 6);
>         } else
>             tmp0 = get_bits(&tctx->gb, bits);

bits = 6;
in the if and get rid of the else IMO.

>         if (bits == 7) {
>             if (get_bits1(&tctx->gb))
>                 sign1 = -1;
> 
>             tmp1 = get_bits(&tctx->gb, 6);
>         } else
>             tmp1 = get_bits(&tctx->gb, bits);

Same.

>     int period = min_period +
>         ROUNDED_DIV(period_coef*(max_period - min_period),
>                     (1 << mtab->ppc_period_bit) - 1);

IMO too complex for a single statement.

>     if ((isampf == 22) && (ibps == 32)) {

pointless ().

>         // Stupid corner case
>         width = ROUNDED_DIV((period + 800)* mtab->peak_per2wid, 400*mtab->size);

The comment is not very helpful.

>     for (i=1; i<order; i++)
>         if (lsp[i-1] > lsp[i] - min_dist) {
>             lsp[i-1] = (lsp[i] + lsp[i-1] - min_dist) * 0.5;
>             lsp[i  ] = (lsp[i] + lsp[i-1] + min_dist) * 0.5;
>         }

Hm. I don't like it too much for some reason...
E.g. if (lsp[i] - lsp[i-1] < min_dist)
would seem more logical.
Also I think an extra variable might reduce possible confusion:
float new_prev = (lsp[i] + lsp[i-1] - min_dist) * 0.5;
float new_cur  = (lsp[i] + new_prev - min_dist) * 0.5;
lsp[i-1] = new_prev;
lsp[i]   = new_cur;
but not 100% sure.

> static void bubblesort(float *lsp, int lp_order)
> {
>     int i,j;
> 
>     /* sort lsp in ascending order. float bubble agorithm,
>        O(n) if data already sorted, O(n^2) - otherwise */
>     for (i=0; i<lp_order-1; i++)
>         for (j=i; j>=0 && lsp[j] > lsp[j+1]; j--)
>             FFSWAP(float, lsp[j], lsp[j+1]);
> }

Well, there is the standard C sort function I think, which might be just as well...
Remainder not fully reviewed....