[FFmpeg-devel] [PATCH]HE-AACv1 try 3 (all missing functionality added)

Sat Feb 13 09:05:08 CET 2010

On Feb 12, 2010, at 3:15 PM, Alex Converse wrote:

> Notes:
> *A filterbank that supports float_to_int16_c as been added
> *All the computation time is spent in ff_sbr_apply() and it's
> children. If it isn't called from ff_sbr_apply() making it 100% faster
> isn't going to buy us anything.
> *No calls to lrintf depend on rounding behavior at (2*n+1)*0.5
> *Some sample SIMD placeholders are attached as a second patch.
> *Right now the synthesis filterbank is written on top on an MDCT. With
> appropriate SIMD functions it may make sense to move it to an FFT.
> Right now the MDCT version is much faster.
> *The analysis filterbank has been switched from an FFT to an RDFT.
> <sbr.diff>

> +/// constant to avoid division by zero, e.g. 96 dB below maximum signal input
> +#define EPS0 0.000000000001

FLT_EPSILON

> +    for (n = 0; n < 64; n++) {
> +        float pre = M_PI * n / 64;
> +        analysis_cos_pre[n] = cos(pre);
> +        analysis_sin_pre[n] = sin(pre);
> +    }
> +    for (k = 0; k < 32; k++) {
> +        float post = M_PI * (k + 0.5) / 128;
> +        analysis_cossin_post[k][0] =  2.0 * cos(post);
> +        analysis_cossin_post[k][1] = -2.0 * sin(post);
> +    }

cosf()/sinf()

> +    if (sbr->sample_rate < 32000) {
> +        temp = 3000;
> +    } else if (sbr->sample_rate < 64000) {
> +        temp = 4000;
> +    } else
> +        temp = 5000;

Nitpick: either do braces or no braces. (ignore that if there was already some discussion about it)

> +    // temp == max number of QMF subbands
> +    if (sbr->sample_rate <= 32000) {
> +        temp = 48;
> +    } else if (sbr->sample_rate == 44100) {
> +        temp = 35;
> +    } else if (sbr->sample_rate >= 48000)
> +        temp = 32;

Same.

> +        ch_data->bs_var_bord[1] = get_bits(gb, 2);
> +        ch_data->bs_num_rel[1]  = get_bits(gb, 2);
> +        ch_data->bs_num_env[1] = ch_data->bs_num_rel[1] + 1;

Misaligned =.

> +    ch_data->bs_num_noise = ch_data->bs_num_env[1] > 1 ? 2 : 1;

1 + (ch_data->bs_num_env[1] > 1)

> +    for (i = 0; i < ch_data->bs_num_env[1]; i++)
> +        ch_data->bs_df_env[i] = get_bits1(gb);
> +    for (i = 0; i < ch_data->bs_num_noise; i++)
> +        ch_data->bs_df_noise[i] = get_bits1(gb);

Misaligned =.

> +                sbr->data[0].env_facs[l][k] = temp1 / (1.0f + exp2f( temp2));
> +                sbr->data[1].env_facs[l][k] = temp1 / (1.0f + exp2f(-temp2));

Can't exp2f(x), exp2f(-x) be combined?

> +        dk = phi[2][1][0] * phi[1][0][0] -
> +               (phi[1][1][0] * phi[1][1][0] + phi[1][1][1] * phi[1][1][1]) / 1.000001f;

> +            alpha1[k][0] = temp_real / dk;
> +            alpha1[k][1] = temp_im   / dk;

> +            alpha0[k][0] = -temp_real / phi[1][0][0];
> +            alpha0[k][1] = -temp_im   / phi[1][0][0];

> +            const int env_size = 2 * (ch_data->t_env[l + 1] - ch_data->t_env[l]);
> +                e_curr[l][m] = sum / env_size;

> +                const int den = env_size * (table[p + 1] - table[p]);
> +                sum /= den;

Use reciprocals for all three of those if you can.
Also, can env_size/den be 0 with invalid input?

Don't see anything else.