[FFmpeg-devel] [PATCH] avcodec/wmalosslessdec: real 24bit support

Thu Apr 14 07:56:12 CEST 2016

Hi,

2016-04-12 22:53 GMT+02:00 Paul B Mahol <onemda at gmail.com>:

> -    LLAudDSPContext dsp;                           ///< accelerated

And later:

> +static int scalarproduct_and_madd_int(int *v1, const int *v2,
> +                                      const int *v3,
> +                                      int order, int mul)
> +{
> +    int res = 0;
> +
> +    av_assert0(order >= 0);
> +    while (order--) {
> +        res   += *v1 * *v2++;
> +        *v1++ += mul * *v3++;
> +    }
> +    return res;
> +}

As Hendrik said, please move it to LLAudDSPContext.

On a side note, is this through RE or guess (I'm asking because the
guess was not difficult) ?

Because having an accumulator on 32 bits only may lead to overflows.

> +    int     mclms_prevvalues[WMALL_MAX_CHANNELS * 2 * 32];
> +    int     mclms_updates[WMALL_MAX_CHANNELS * 2 * 32];
>      int     mclms_recent;
>
>      int     movave_scaling;
> @@ -146,9 +144,9 @@ typedef struct WmallDecodeCtx {
>          int scaling;
>          int coefsend;
>          int bitsend;
> -        DECLARE_ALIGNED(16, int16_t, coefs)[MAX_ORDER +
> WMALL_COEFF_PAD_SIZE/sizeof(int16_t)];
> -        DECLARE_ALIGNED(16, int16_t, lms_prevvalues)[MAX_ORDER * 2 +
> WMALL_COEFF_PAD_SIZE/sizeof(int16_t)];
> -        DECLARE_ALIGNED(16, int16_t, lms_updates)[MAX_ORDER * 2 +
> WMALL_COEFF_PAD_SIZE/sizeof(int16_t)];
> +        int coefs[MAX_ORDER + WMALL_COEFF_PAD_SIZE];
> +        int lms_prevvalues[MAX_ORDER * 2 + WMALL_COEFF_PAD_SIZE];
> +        int lms_updates[MAX_ORDER * 2 + WMALL_COEFF_PAD_SIZE];

I prefer int32_t just because it's something to dspize. Plus at some
point someone would have to redo the alignment.

> -               sizeof(int16_t) * order * num_channels);
> +               sizeof(int) * order * num_channels);

The format has the bitdepth stored and put into s->bits_per_sample.
The decoder actually uses it to select how to store the samples later
on.
In any such case, this should be dynamic. Either you use int size =
s->bits_per_sample>16 ? 4 : 2 (because sizeof(int16_t isn't going to
change much...)

Or FFALIGN(s->bits_per_sample>>3, 2)? Whatever floats your boat.

> -            pred +=
> s->dsp.scalarproduct_and_madd_int16(s->cdlms[ch][ilms].coefs,
> -
> s->cdlms[ch][ilms].lms_prevvalues
> -                                                            +
> s->cdlms[ch][ilms].recent,
> -
> s->cdlms[ch][ilms].lms_updates
> -                                                            +
> s->cdlms[ch][ilms].recent,
> -
> FFALIGN(s->cdlms[ch][ilms].order,
> -
> WMALL_COEFF_PAD_SIZE),
> -                                                        WMASIGN(residue));
> +            pred += scalarproduct_and_madd_int(s->cdlms[ch][ilms].coefs,
> +
> s->cdlms[ch][ilms].lms_prevvalues
> +                                                   + s->cdlms[ch][ilms].recent,
> +                                               s->cdlms[ch][ilms].lms_updates
> +                                                   + s->cdlms[ch][ilms].recent,
> +                                               s->cdlms[ch][ilms].order,
> +                                               WMASIGN(residue));

And then here:
- switch based on bitdepth (the needed 'if' wouldn't be the end of the
world but it's not actually needed);
- or use a function pointer in the context

For the later point, unless going through a proxy, it may, obviously, look like:
int (*scalarproduct_and_madd_int)(void *v1, const void *v2,
                                      const void *v3, int order, int mul)
but there might be compilation warning on call or setting the variable.

-- 
Christophe