[FFmpeg-devel] [PATCH] wmapro decoder

Sun Aug 23 16:33:13 CEST 2009

On Fri, Aug 21, 2009 at 07:33:45PM +0200, Sascha Sommer wrote:
> Hi,
> 
> I attached an updated patch. As you might have already noticed, I do not have 
> much time to work on this project so please keep the focus on the important 
> things. I do not mind if Diego or someone else fixes the alignment, coding 
> style, typo and wording problems directly in the SVN sources or if these 
> things are pointed out in a single review but it is very frustrating to 
> resubmit this patch again and again and to synchronize the main and soc svn
> for things that in the end do not give any real benefit. This is an 
> unacceptable waste of my time. Thanks.

[...]
> > > +        while (missing_samples > 0) {
> >
> > isn't that the same as a simple check on min_channel_len, which at the end
> > should be the frame len?
> >
> 
> It is but I don't think the code will become cleaner when min_channel_len is 
> checked.

I think the whole tile code is messy and could be simplified. I am of course
not saying that the above change would be a good idea, but something should
change to make things simpler ...

> 
> > >
> > > +                if (channels_for_cur_subframe == 1 ||
> > > +                   min_samples == missing_samples)
> >
> > these 2 look redundant
> > also the condition for reading the mask could just be used instead of
> > the temporary var read_channel_mask
> 
> Did the second thing. Please explain why these 2 look redundant.
> 

I don't remember :(

[...]
> > > +                /** add subframes to the individual channels */
> > > +                if (min_channel_len == chan->channel_len) {
> > > +                    --channels_for_cur_subframe;
> > > +                    if (channel_mask & (1<<channels_for_cur_subframe)) {
> >
> > I'd do a get_bits1() here instead of loading it in a mask and then
> > extracting it
> > (btw you can just do GetBitContext mask_gb= *s->gb)
> 
> Then this would reintroduce the check for the case that the subframe is used 
> for all channels.

init_get_bits() from a 0xFFFFFFFF constant would avoid that, but it's all of
course just a suggestion, in case it improves the code

[...]
> +/**
> + * @brief main decoder context
> + */
> +typedef struct WMA3DecodeContext {
> +    /* generic decoder variables */
> +    AVCodecContext*  avctx;                         ///< codec context for av_log
> +    DSPContext       dsp;                           ///< accelerated DSP functions
> +    uint8_t          frame_data[MAX_FRAMESIZE +
> +                      FF_INPUT_BUFFER_PADDING_SIZE];///< compressed frame data
> +    MDCTContext      mdct_ctx[WMAPRO_BLOCK_SIZES];  ///< MDCT context per block size
> +    DECLARE_ALIGNED_16(float, tmp[WMAPRO_BLOCK_MAX_SIZE]); ///< IMDCT output buffer
> +    float*           windows[WMAPRO_BLOCK_SIZES];   ///< windows for the different block sizes
> +

> +    /* frame size dependent frame information (set during initialization) */
> +    uint8_t          lossless;                      ///< lossless mode

never set, or did I miss it?

> +    uint32_t         decode_flags;                  ///< used compression features
> +    uint8_t          len_prefix;                    ///< frame is prefixed with its length
> +    uint8_t          dynamic_range_compression;     ///< frame contains DRC data
> +    uint8_t          bits_per_sample;               ///< integer audio sample size for the unscaled IMDCT output (used to scale to [-1.0, 1.0])
> +    uint16_t         samples_per_frame;             ///< number of samples to output
> +    uint16_t         log2_frame_size;
> +    int8_t           num_channels;                  ///< number of channels in the stream (same as AVCodecContext.num_channels)
> +    int8_t           lfe_channel;                   ///< lfe channel index
> +    uint8_t          max_num_subframes;

> +    int8_t           num_possible_block_sizes;      ///< number of distinct block sizes that can be found in the file

I think the doxy is poor because it could also apply to 4 = [1,2,16,32]

[...]
> +
> +/**
> + *@brief Decode how the data in the frame is split into subframes.
> + *       Every WMA frame contains the encoded data for a fixed number of
> + *       samples per channel. The data for every channel might be split
> + *       into several subframes. This function will reconstruct the list of
> + *       subframes for every channel.
> + *
> + *       If the subframes are not evenly split, the algorithm estimates the
> + *       channels with the lowest number of total samples.
> + *       Afterwards, for each of these channels a bit is read from the
> + *       bitstream that indicates if the channel contains a subframe with the
> + *       next subframe size that is going to be read from the bitstream or not.
> + *       If a channel contains such a subframe, the subframe size gets added to
> + *       the channel's subframe list.
> + *       The algorithm repeats these steps until the frame is properly divided
> + *       between the individual channels.
> + *
> + *@param s context
> + *@return 0 on success, < 0 in case of an error
> + */
> +static int decode_tilehdr(WMA3DecodeContext *s)
> +{
> +    int c;
> +    uint16_t num_samples[WMAPRO_MAX_CHANNELS];
> +
> +    /* Should never consume more than 3073 bits (256 iterations for the
> +     * while loop when always the minimum amount of 128 samples is subtracted
> +     * from missing samples in the 8 channel case).
> +     * 1 + BLOCK_MAX_SIZE * MAX_CHANNELS / BLOCK_MIN_SIZE * (MAX_CHANNELS  + 4)
> +     */
> +
> +    /** reset tiling information */
> +    for (c = 0; c < s->num_channels; c++)
> +        s->channel[c].num_subframes = 0;
> +
> +    memset(num_samples, 0, sizeof(num_samples));
> +
> +    /** handle the easy case with one constant-sized subframe per channel */
> +    if (s->max_num_subframes == 1) {
> +        for (c = 0; c < s->num_channels; c++) {
> +            s->channel[c].num_subframes = 1;
> +            s->channel[c].subframe_len[0] = s->samples_per_frame;
> +        }
> +    } else { /** subframe length and number of subframes is not constant */
> +        int missing_samples = s->num_channels * s->samples_per_frame;
> +        int subframe_len_bits = 0;     /** bits needed for the subframe length */
> +        int subframe_len_zero_bit = 0; /** first bit indicates if length is zero */
> +        int fixed_channel_layout;      /** all channels have the same subframe layout */
> +
> +        fixed_channel_layout = get_bits1(&s->gb);
> +

> +        /** calculate subframe len bits */
> +        if (s->lossless) {
> +            subframe_len_bits = av_log2(s->max_num_subframes - 1) + 1;
> +        } else {
> +            if (s->max_num_subframes == 16)
> +                subframe_len_zero_bit = 1;
> +            subframe_len_bits = av_log2(av_log2(s->max_num_subframes)) + 1;
> +        }

this can maybe be moved to decode_init()
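
To illustrate the suggestion, here is a minimal sketch of computing the two values once at init time. It assumes a standalone setting: the struct SubframeLenParams, the function init_subframe_len_params() and the ilog2() stand-in for av_log2() are hypothetical names, not part of the patch.

```c
/* Stand-in for av_log2(): floor(log2(v)), returning 0 for v <= 1. */
static int ilog2(unsigned v)
{
    int n = 0;
    while (v >>= 1)
        n++;
    return n;
}

/* Hypothetical container for values that only depend on stream parameters. */
typedef struct SubframeLenParams {
    int len_bits; /* bits needed for the subframe length */
    int zero_bit; /* first bit indicates if length is zero */
} SubframeLenParams;

/* Same arithmetic as in the quoted hunk, evaluated once instead of per frame. */
static void init_subframe_len_params(SubframeLenParams *p, int lossless,
                                     int max_num_subframes)
{
    p->zero_bit = 0;
    if (lossless) {
        p->len_bits = ilog2(max_num_subframes - 1) + 1;
    } else {
        if (max_num_subframes == 16)
            p->zero_bit = 1;
        p->len_bits = ilog2(ilog2(max_num_subframes)) + 1;
    }
}
```

Both inputs are fixed for the whole stream, so the result would not change between frames.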

> +
> +        /** loop until the frame data is split between the subframes */
> +        while (missing_samples > 0) {
> +            unsigned int channel_mask = 0;
> +            int min_channel_len;
> +            int channels_for_cur_subframe = 0;
> +            int subframe_len;
> +            /** minimum number of samples that need to be read */
> +            int min_samples = s->min_samples_per_subframe;
> +

> +            if (fixed_channel_layout) {
> +                channels_for_cur_subframe = s->num_channels;
> +                min_channel_len = num_samples[0];
> +            } else {
> +                min_channel_len = s->samples_per_frame;
> +                /** find channels with the smallest overall length */
> +                for (c = 0; c < s->num_channels; c++) {
> +                    if (num_samples[c] <= min_channel_len) {
> +                        if (num_samples[c] < min_channel_len) {
> +                            channels_for_cur_subframe = 0;
> +                            min_channel_len = num_samples[c];
> +                        }
> +                        ++channels_for_cur_subframe;
> +                    }
> +                }
> +            }

is the fixed_channel_layout special case needed?
also, can't the initial min_channel_len be INT_MAX?
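
A minimal sketch of the INT_MAX variant, written as a standalone helper. The name find_min_channels() and the plain int array are hypothetical and only serve the illustration; the patch itself keeps this logic inline.

```c
#include <limits.h>

/* Hypothetical helper: returns how many channels share the smallest
 * sample count and stores that smallest count in *min_len. */
static int find_min_channels(const int *num_samples, int num_channels,
                             int *min_len)
{
    int count = 0;
    int c;

    *min_len = INT_MAX;
    for (c = 0; c < num_channels; c++) {
        if (num_samples[c] < *min_len) {
            /* new minimum found, restart the count */
            *min_len = num_samples[c];
            count    = 1;
        } else if (num_samples[c] == *min_len) {
            count++;
        }
    }
    return count;
}
```

If all channels always carry the same number of samples, as should be the case with a fixed layout, this loop returns num_channels and num_samples[0], which is what the special case sets by hand.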

> +            min_samples *= channels_for_cur_subframe;
> +
> +            /** For every channel with the minimum length, 1 bit
> +                might be transmitted that informs us if the channel
> +                contains a subframe with the next subframe_len. */
> +            if (fixed_channel_layout || channels_for_cur_subframe == 1 ||
> +                                             min_samples == missing_samples) {
> +                channel_mask = -1;
> +            } else {
> +                channel_mask = get_bits(&s->gb, channels_for_cur_subframe);
> +                if (!channel_mask) {
> +                    av_log(s->avctx, AV_LOG_ERROR,
> +                        "broken frame: zero frames for subframe_len\n");
> +                    return AVERROR_INVALIDDATA;
> +                }
> +            }
> +
> +            /** if we have the choice get next subframe length from the
> +                bitstream */
> +            if (min_samples != missing_samples) {

> +                int log2_subframe_len = 0;
> +                /* 1 bit indicates if the subframe is of maximum length */
> +                if (subframe_len_zero_bit) {
> +                    if (get_bits1(&s->gb)) {
> +                        log2_subframe_len = 1 +
> +                            get_bits(&s->gb, subframe_len_bits-1);
> +                    }
> +                } else
> +                    log2_subframe_len = get_bits(&s->gb, subframe_len_bits);

> +
> +                if (s->lossless) {
> +                    subframe_len =
> +                        s->samples_per_frame / s->max_num_subframes;
> +                    subframe_len *= log2_subframe_len + 1;
> +                } else {
> +                    subframe_len =
> +                        s->samples_per_frame / (1 << log2_subframe_len);
> +                }

one of these multiplies by log2_subframe_len, the other divides; is that
intended?

[...]

> @@ -113,11 +627,11 @@
>      int i;
>      int offset = 0;
>      int8_t rotation_offset[WMAPRO_MAX_CHANNELS * WMAPRO_MAX_CHANNELS];
> -    memset(chgroup->decorrelation_matrix,0,
> +    memset(chgroup->decorrelation_matrix, 0,
>             sizeof(float) *s->num_channels * s->num_channels);
>  
>      for (i = 0; i < chgroup->num_channels * (chgroup->num_channels - 1) >> 1; i++)
> -        rotation_offset[i] = get_bits(&s->gb,6);
> +        rotation_offset[i] = get_bits(&s->gb, 6);
>  
>      for (i = 0; i < chgroup->num_channels; i++)
>          chgroup->decorrelation_matrix[chgroup->num_channels * i + i] =
> @@ -127,7 +641,7 @@
>          int x;
>          for (x = 0; x < i; x++) {
>              int y;
> -            for (y = 0; y < i + 1 ; y++) {
> +            for (y = 0; y < i + 1; y++) {
>                  float v1 = chgroup->decorrelation_matrix[x * chgroup->num_channels + y];
>                  float v2 = chgroup->decorrelation_matrix[i * chgroup->num_channels + y];
>                  int n = rotation_offset[offset + x];

ok and please just commit such changes

> @@ -153,42 +667,317 @@
>  }
>  
>  /**
> - *@brief Reconstruct the individual channel data.
> + *@brief Decode channel transformation parameters
>   *@param s codec context
> + *@return 0 in case of success, < 0 in case of bitstream errors
>   */
> -static void inverse_channel_transform(WMA3DecodeContext *s)
> +static int decode_channel_transform(WMA3DecodeContext* s)
>  {
>      int i;
> +    /* should never consume more than 1921 bits for the 8 channel case
> +     * 1 + MAX_CHANNELS * ( MAX_CHANNELS + 2 + 3 * MAX_CHANNELS * MAX_CHANNELS
> +     * + MAX_CHANNELS + MAX_BANDS + 1)
> +     */
>  
> -    for (i = 0; i < s->num_chgroups; i++) {
> +    /** in the one channel case channel transforms are pointless */
> +    s->num_chgroups = 0;
> +    if (s->num_channels > 1) {
> +        int remaining_channels = s->channels_for_cur_subframe;
>  
> -        if (s->chgroup[i].transform == 1) {
> -            /** M/S stereo decoding */
> -            int16_t* sfb_offsets = s->cur_sfb_offsets;
> -            float* ch0 = *sfb_offsets + s->channel[0].coeffs;
> -            float* ch1 = *sfb_offsets++ + s->channel[1].coeffs;
> -            const char* tb = s->chgroup[i].transform_band;
> -            const char* tb_end = tb + s->num_bands;
> +        if (get_bits1(&s->gb)) {
> +            av_log_ask_for_sample(s->avctx,
> +                "unsupported channel transform bit\n");
> +            return AVERROR_INVALIDDATA;
> +        }
>  
> -            while (tb < tb_end) {
> -                const float* ch0_end = s->channel[0].coeffs +
> -                                       FFMIN(*sfb_offsets,s->subframe_len);
> -                if (*tb++ == 1) {
> -                    while (ch0 < ch0_end) {
> -                        const float v1 = *ch0;
> -                        const float v2 = *ch1;
> -                        *ch0++ = v1 - v2;
> -                        *ch1++ = v1 + v2;
> +        for (s->num_chgroups = 0; remaining_channels &&
> +            s->num_chgroups < s->channels_for_cur_subframe; s->num_chgroups++) {
> +            WMA3ChannelGroup* chgroup = &s->chgroup[s->num_chgroups];
> +            float** channel_data = chgroup->channel_data;
> +            chgroup->num_channels = 0;
> +            chgroup->transform = 0;
> +
> +            /** decode channel mask */
> +            if (remaining_channels > 2) {
> +                for (i = 0; i < s->channels_for_cur_subframe; i++) {
> +                    int channel_idx = s->channel_indexes_for_cur_subframe[i];
> +                    if (!s->channel[channel_idx].grouped
> +                        && get_bits1(&s->gb)) {
> +                        ++chgroup->num_channels;
> +                        s->channel[channel_idx].grouped = 1;
> +                        *channel_data++ = s->channel[channel_idx].coeffs;
>                      }
> +                }
> +            } else {
> +                chgroup->num_channels = remaining_channels;
> +                for (i = 0; i < s->channels_for_cur_subframe; i++) {
> +                    int channel_idx = s->channel_indexes_for_cur_subframe[i];
> +                    if (!s->channel[channel_idx].grouped)
> +                        *channel_data++ = s->channel[channel_idx].coeffs;
> +                    s->channel[channel_idx].grouped = 1;
> +                }
> +            }
> +
> +            /** decode transform type */
> +            if (chgroup->num_channels == 2) {
> +                if (get_bits1(&s->gb)) {
> +                    if (get_bits1(&s->gb)) {
> +                        av_log_ask_for_sample(s->avctx,
> +                               "unsupported channel transform type\n");
> +                    }
>                  } else {
> -                    while (ch0 < ch0_end) {
> -                        *ch0++ *= 181.0 / 128;
> -                        *ch1++ *= 181.0 / 128;
> +                    chgroup->transform = 1;
> +                    if (s->num_channels == 2) {
> +                        chgroup->decorrelation_matrix[0] =  1.0;
> +                        chgroup->decorrelation_matrix[1] = -1.0;
> +                        chgroup->decorrelation_matrix[2] =  1.0;
> +                        chgroup->decorrelation_matrix[3] =  1.0;
> +                    } else {
> +                        /** cos(pi/4) */
> +                        chgroup->decorrelation_matrix[0] =  0.70703125;
> +                        chgroup->decorrelation_matrix[1] = -0.70703125;
> +                        chgroup->decorrelation_matrix[2] =  0.70703125;
> +                        chgroup->decorrelation_matrix[3] =  0.70703125;
>                      }
>                  }
> -                ++sfb_offsets;
> +            } else if (chgroup->num_channels > 2) {
> +                if (get_bits1(&s->gb)) {
> +                    chgroup->transform = 1;
> +                    if (get_bits1(&s->gb)) {
> +                        decode_decorrelation_matrix(s, chgroup);
> +                    } else {
> +                        /** FIXME: more than 6 coupled channels not supported */
> +                        if (chgroup->num_channels > 6) {
> +                            av_log_ask_for_sample(s->avctx,
> +                                   "coupled channels > 6\n");
> +                        } else {
> +                            memcpy(chgroup->decorrelation_matrix,
> +                              default_decorrelation[chgroup->num_channels],
> +                              sizeof(float) * chgroup->num_channels *
> +                              chgroup->num_channels);
> +                        }
> +                    }
> +                }
>              }
> -        } else if (s->chgroup[i].transform) {
> +
> +            /** decode transform on / off */
> +            if (chgroup->transform) {
> +                if (!get_bits1(&s->gb)) {
> +                    int i;
> +                    /** transform can be enabled for individual bands */
> +                    for (i = 0; i < s->num_bands; i++) {
> +                        chgroup->transform_band[i] = get_bits1(&s->gb);
> +                    }
> +                } else {
> +                    memset(chgroup->transform_band, 1, s->num_bands);
> +                }
> +            }
> +            remaining_channels -= chgroup->num_channels;
> +        }
> +    }
> +    return 0;
> +}

whatever happened here, it's not reviewable

> +
> +/**
> + *@brief Extract the coefficients from the bitstream.
> + *@param s codec context
> + *@param c current channel number
> + *@return 0 on success, < 0 in case of bitstream errors
> + */
> +static int decode_coeffs(WMA3DecodeContext *s, int c)
> +{
> +    int vlctable;
> +    VLC* vlc;
> +    WMA3ChannelCtx* ci = &s->channel[c];
> +    int rl_mode = 0;
> +    int cur_coeff = 0;
> +    int num_zeros = 0;
> +    const uint16_t* run;
> +    const uint16_t* level;
> +
> +    dprintf(s->avctx, "decode coefficients for channel %i\n", c);
> +
> +    vlctable = get_bits1(&s->gb);
> +    vlc = &coef_vlc[vlctable];
> +
> +    if (vlctable) {
> +        run = coef1_run;
> +        level = coef1_level;
> +    } else {
> +        run = coef0_run;
> +        level = coef0_level;
> +    }
> +
> +    /** decode vector coefficients (consumes up to 167 bits per iteration for
> +      4 vector coded large values) */
> +    while (!rl_mode && cur_coeff + 3 < s->subframe_len) {
> +        int vals[4];
> +        int i;
> +        unsigned int idx;
> +
> +        idx = get_vlc2(&s->gb, vec4_vlc.table, VLCBITS, VEC4MAXDEPTH);
> +
> +        if ( idx == HUFF_VEC4_SIZE - 1 ) {
> +            for (i = 0 ; i < 4 ; i += 2) {
> +                idx = get_vlc2(&s->gb, vec2_vlc.table, VLCBITS, VEC2MAXDEPTH);
> +                if ( idx == HUFF_VEC2_SIZE - 1 ) {
> +                    vals[i] = get_vlc2(&s->gb, vec1_vlc.table, VLCBITS, VEC1MAXDEPTH);
> +                    if (vals[i] == HUFF_VEC1_SIZE - 1)
> +                        vals[i] += ff_wma_get_large_val(&s->gb);
> +                    vals[i+1] = get_vlc2(&s->gb, vec1_vlc.table, VLCBITS, VEC1MAXDEPTH);
> +                    if (vals[i+1] == HUFF_VEC1_SIZE - 1)
> +                        vals[i+1] += ff_wma_get_large_val(&s->gb);
> +                } else {
> +                    vals[i]   = symbol_to_vec2[idx] >> 4;
> +                    vals[i+1] = symbol_to_vec2[idx] & 0xF;
> +                }
> +            }
> +        } else {
> +             vals[0] =  symbol_to_vec4[idx] >> 12;
> +             vals[1] = (symbol_to_vec4[idx] >> 8) & 0xF;
> +             vals[2] = (symbol_to_vec4[idx] >> 4) & 0xF;
> +             vals[3] =  symbol_to_vec4[idx]       & 0xF;
> +        }
> +
> +        /** decode sign */
> +        for (i = 0; i < 4; i++) {
> +            if (vals[i]) {
> +                int sign = get_bits1(&s->gb) - 1;
> +                ci->coeffs[cur_coeff] = (vals[i]^sign) - sign;
> +                num_zeros = 0;
> +            } else {
> +                /** switch to run level mode when subframe_len / 128 zeros
> +                   were found in a row */
> +                rl_mode |= (++num_zeros > s->subframe_len>>8);
> +            }
> +            ++cur_coeff;
> +        }
> +    }
> +
> +    /** decode run level coded coefficients */
> +    if (rl_mode) {
> +        if(ff_wma_run_level_decode(s->avctx, &s->gb, vlc,
> +                             level, run, 1, ci->coeffs,
> +                             cur_coeff, s->subframe_len, s->subframe_len,
> +                             s->esc_len, 0))
> +            return AVERROR_INVALIDDATA;
> +    }
> +
> +    return 0;
> +}
> +

ok

> +/**
> + *@brief Extract scale factors from the bitstream.
> + *@param s codec context
> + *@return 0 on success, < 0 in case of bitstream errors
> + */
> +static int decode_scale_factors(WMA3DecodeContext* s)
> +{
> +    int i;
> +
> +    /** should never consume more than 5344 bits
> +     *  MAX_CHANNELS * (1 +  MAX_BANDS * 23)
> +     */
> +
> +    for (i = 0; i < s->channels_for_cur_subframe; i++) {
> +        int c = s->channel_indexes_for_cur_subframe[i];
> +        int* sf;
> +        int* sf_end = s->channel[c].scale_factors + s->num_bands;
> +
> +        /** resample scale factors for the new block size */
> +        if (s->channel[c].reuse_sf) {
> +            const int blocks_per_frame = s->samples_per_frame/s->subframe_len;
> +            const int res_blocks_per_frame = s->samples_per_frame /
> +                                          s->channel[c].scale_factor_block_len;

> +            const int idx0 = av_log2(blocks_per_frame);
> +            const int idx1 = av_log2(res_blocks_per_frame);

I assume these are exact 2^x values, in which case storing them as
log2 values might be simpler?
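
As a sketch of what that could look like, assuming both sizes really are powers of two and are kept as exponents (the function name and its parameters are hypothetical):

```c
/* Hypothetical: sizes are stored as exponents instead of sample counts.
 * For a >= b, (1 << a) / (1 << b) == 1 << (a - b), so the table index
 * becomes a subtraction and no av_log2() call is needed. */
static int log2_blocks_per_frame(int log2_samples_per_frame,
                                 int log2_block_len)
{
    return log2_samples_per_frame - log2_block_len;
}
```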

> +            const int8_t* sf_offsets = s->sf_offsets[idx0][idx1];
> +            int b;
> +            for (b = 0; b < s->num_bands; b++)
> +                s->channel[c].scale_factors[b] =
> +                                   s->channel[c].saved_scale_factors[*sf_offsets++];
> +        }
> +

> +        if (s->channel[c].cur_subframe > 0) {
> +            s->channel[c].transmit_sf = get_bits1(&s->gb);
> +        } else
> +            s->channel[c].transmit_sf = 1;
> +
> +        if (s->channel[c].transmit_sf) {

can transmit_sf be a local var here? there's no other use in this patch

> +
> +            if (!s->channel[c].reuse_sf) {
> +                int val;
> +                /** decode DPCM coded scale factors */
> +                s->channel[c].scale_factor_step = get_bits(&s->gb, 2) + 1;
> +                val = 45 / s->channel[c].scale_factor_step;
> +                for (sf = s->channel[c].scale_factors; sf < sf_end; sf++) {
> +                    val += get_vlc2(&s->gb, sf_vlc.table, SCALEVLCBITS, SCALEMAXDEPTH) - 60;
> +                    *sf = val;
> +                }
> +            } else {
> +                int i;
> +                /** run level decode differences to the resampled factors */
> +                for (i = 0; i < s->num_bands; i++) {
> +                    int idx;
> +                    int skip;
> +                    int val;
> +                    int sign;
> +
> +                    idx = get_vlc2(&s->gb, sf_rl_vlc.table, VLCBITS, SCALERLMAXDEPTH);
> +
> +                    if ( !idx ) {
> +                        uint32_t code = get_bits(&s->gb, 14);
> +                        val  =  code >> 6;
> +                        sign = (code & 1) - 1;
> +                        skip = (code & 0x3f) >> 1;
> +                    } else if (idx == 1) {
> +                        break;
> +                    } else {
> +                        skip = scale_rl_run[idx];
> +                        val  = scale_rl_level[idx];
> +                        sign = get_bits1(&s->gb)-1;
> +                    }
> +
> +                    i += skip;
> +                    if (i >= s->num_bands) {
> +                        av_log(s->avctx,AV_LOG_ERROR,
> +                               "invalid scale factor coding\n");
> +                        return AVERROR_INVALIDDATA;
> +                    }
> +                    s->channel[c].scale_factors[i] += (val ^ sign) - sign;
> +                }
> +            }
> +

> +            /** save transmitted scale factors so that they can be reused for
> +                the next subframe */
> +            memcpy(s->channel[c].saved_scale_factors,
> +                   s->channel[c].scale_factors,
> +                   sizeof(int) * s->num_bands);

sizeof(*s->channel[c].saved_scale_factors)
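
The point of the idiom is that the element size follows the array's declared type. A minimal sketch, with a hypothetical ChannelCtx struct and EXAMPLE_MAX_BANDS constant standing in for the real context:

```c
#include <string.h>

#define EXAMPLE_MAX_BANDS 32 /* hypothetical bound for this sketch */

/* Hypothetical, reduced channel context. */
typedef struct ChannelCtx {
    int scale_factors[EXAMPLE_MAX_BANDS];
    int saved_scale_factors[EXAMPLE_MAX_BANDS];
} ChannelCtx;

static void save_scale_factors(ChannelCtx *ch, int num_bands)
{
    /* sizeof(*array) stays correct if the element type is changed later,
     * e.g. from int to int8_t, whereas sizeof(int) would not. */
    memcpy(ch->saved_scale_factors, ch->scale_factors,
           sizeof(*ch->saved_scale_factors) * num_bands);
}
```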

> +            s->channel[c].scale_factor_block_len = s->subframe_len;
> +            s->channel[c].reuse_sf               = 1;
> +        }
> +
> +        /** calculate new scale factor maximum */
> +        s->channel[c].max_scale_factor = s->channel[c].scale_factors[0];
> +        for (sf = s->channel[c].scale_factors + 1; sf < sf_end; sf++) {

> +            if (s->channel[c].max_scale_factor < *sf)
> +                s->channel[c].max_scale_factor = *sf;

FFMAX
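
That is, roughly the following. FFMAX is defined locally here only to keep the sketch standalone (libavutil provides a macro of that name), and the helper max_scale_factor() is a hypothetical name.

```c
/* Local definition for the sketch; in FFmpeg the macro comes from libavutil. */
#define FFMAX(a, b) ((a) > (b) ? (a) : (b))

/* Hypothetical helper: largest scale factor of the first num_bands bands.
 * Requires num_bands >= 1. */
static int max_scale_factor(const int *scale_factors, int num_bands)
{
    int max = scale_factors[0];
    int b;

    for (b = 1; b < num_bands; b++)
        max = FFMAX(max, scale_factors[b]);
    return max;
}
```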

[...]
> @@ -199,8 +988,8 @@
>              /** multichannel decorrelation */
>              for (sfb = s->cur_sfb_offsets ;
>                  sfb < s->cur_sfb_offsets + s->num_bands;sfb++) {
> +                int y;
>                  if (*tb++ == 1) {
> -                    int y;
>                      /** multiply values with the decorrelation_matrix */
>                      for (y = sfb[0]; y < FFMIN(sfb[1], s->subframe_len); y++) {
>                          const float* mat = s->chgroup[i].decorrelation_matrix;
> @@ -208,7 +997,7 @@
>                          float* data_ptr = data;
>                          float** ch;
>  
> -                        for (ch = ch_data;ch < ch_end; ch++)
> +                        for (ch = ch_data; ch < ch_end; ch++)
>                             *data_ptr++ = (*ch)[y];
>  
>                          for (ch = ch_data; ch < ch_end; ch++) {
> @@ -220,9 +1009,593 @@
>                              (*ch)[y] = sum;
>                          }
>                      }
> +                } else if (s->num_channels == 2) {
> +                    for (y = sfb[0]; y < FFMIN(sfb[1], s->subframe_len); y++) {
> +                        ch_data[0][y] *= 181.0 / 128;
> +                        ch_data[1][y] *= 181.0 / 128;
> +                    }
>                  }
>              }
>          }
>      }
>  }

ok

[...]
> +        memset(s->channel[c].coeffs, 0, sizeof(float) * subframe_len);

sizeof(*s->channel[c].coeffs)

[...]
> +/**
> + *@brief Decode one WMA frame.
> + *@param s codec context
> + *@return 0 if the trailer bit indicates that this is the last frame,
> + *        1 if there are additional frames
> + */
> +static int decode_frame(WMA3DecodeContext *s)
> +{
> +    GetBitContext* gb = &s->gb;
> +    int more_frames = 0;
> +    int len = 0;
> +    int i;
> +
> +    /** check for potential output buffer overflow */
> +    if (s->samples + s->num_channels * s->samples_per_frame > s->samples_end) {

this can overflow,
    s->samples_end - s->samples < ...
should be better
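
A minimal sketch of the subtraction form, with a hypothetical helper name. Comparing the distance between the two pointers against the needed count avoids forming a pointer past the end of the buffer, which the addition form can do.

```c
#include <stddef.h>

/* Hypothetical helper: nonzero if one frame of interleaved output still
 * fits between samples and samples_end. */
static int output_has_room(const float *samples, const float *samples_end,
                           int num_channels, int samples_per_frame)
{
    ptrdiff_t needed = (ptrdiff_t)num_channels * samples_per_frame;

    /* both pointers refer to the same buffer, so the difference is
     * well defined; nothing is added to samples */
    return samples_end - samples >= needed;
}
```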

> +        av_log(s->avctx,AV_LOG_ERROR,
> +               "not enough space for the output samples\n");
> +        s->packet_loss = 1;
> +        return 0;
> +    }
> +
> +    /** get frame length */
> +    if (s->len_prefix)
> +        len = get_bits(gb, s->log2_frame_size);
> +
> +    dprintf(s->avctx, "decoding frame with length %x\n", len);
> +
> +    /** decode tile information */
> +    if (decode_tilehdr(s)) {
> +        s->packet_loss = 1;
> +        return 0;
> +    }
> +
> +    /** read postproc transform */
> +    if (s->num_channels > 1 && get_bits1(gb)) {
> +        av_log_ask_for_sample(s->avctx, "Unsupported postproc transform found\n");
> +        s->packet_loss = 1;
> +        return 0;
> +    }
> +
> +    /** read drc info */
> +    if (s->dynamic_range_compression) {
> +        s->drc_gain = get_bits(gb, 8);
> +        dprintf(s->avctx, "drc_gain %i\n", s->drc_gain);
> +    }
> +
> +    /** no idea what these are for, might be the number of samples
> +        that need to be skipped at the beginning or end of a stream */
> +    if (get_bits1(gb)) {
> +        int skip;
> +
> +        /** usually true for the first frame */
> +        if (get_bits1(gb)) {
> +            skip = get_bits(gb, av_log2(s->samples_per_frame * 2));
> +            dprintf(s->avctx, "start skip: %i\n", skip);
> +        }
> +
> +        /** sometimes true for the last frame */
> +        if (get_bits1(gb)) {
> +            skip = get_bits(gb, av_log2(s->samples_per_frame * 2));
> +            dprintf(s->avctx, "end skip: %i\n", skip);
> +        }
> +
> +    }
> +
> +    dprintf(s->avctx, "BITSTREAM: frame header length was %i\n",
> +           get_bits_count(gb) - s->frame_offset);
> +
> +    /** reset subframe states */
> +    s->parsed_all_subframes = 0;
> +    for (i = 0; i < s->num_channels; i++) {
> +        s->channel[i].decoded_samples = 0;
> +        s->channel[i].cur_subframe    = 0;
> +        s->channel[i].reuse_sf        = 0;
> +    }
> +

> +    /** decode all subframes */
> +    while (!s->parsed_all_subframes) {
> +        if (decode_subframe(s) < 0) {
> +            s->packet_loss = 1;
> +            return 0;
> +        }
> +    }

are all the previous frames before a failed frame used or dropped?
I think dropping all of them might not be wise, but maybe the last few should
be dropped (depends on which works best)

> +
> +    /** interleave samples and write them to the output buffer */
> +    for (i = 0; i < s->num_channels; i++) {
> +        float* ptr;
> +        int incr = s->num_channels;
> +        float* iptr = s->channel[i].out;
> +        int x;
> +
> +        ptr = s->samples + i;
> +
> +        for (x = 0; x < s->samples_per_frame; x++) {
> +            *ptr = av_clipf(*iptr++, -1.0, 32767.0 / 32768.0);
> +            ptr += incr;
> +        }
> +

> +        /** reuse second half of the IMDCT output for the next frame */
> +        memmove(&s->channel[i].out[0],
> +                &s->channel[i].out[s->samples_per_frame],
> +                s->samples_per_frame * sizeof(float));

doesn't look like it needs a memmove (memcpy should do); besides, are you sure
the copy cannot be avoided?
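
A sketch of the first half of that remark, assuming the output buffer holds 2 * samples_per_frame floats as the quoted code implies (the helper name is hypothetical):

```c
#include <string.h>

/* Hypothetical helper: move the second half of the IMDCT output to the
 * front for the next frame. Source [n, 2n) and destination [0, n) do not
 * overlap, so memcpy() is sufficient here. */
static void carry_over_second_half(float *out, int samples_per_frame)
{
    memcpy(out, out + samples_per_frame,
           samples_per_frame * sizeof(*out));
}
```

Avoiding the copy altogether would need a different buffer layout, which this sketch does not attempt.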

[...]
> +/**
> + *@brief Fill the bit reservoir with a (partial) frame.
> + *@param s codec context
> + *@param gb bitstream reader context
> + *@param len length of the partial frame
> + *@param append decides whether to reset the buffer or not
> + */
> +static void save_bits(WMA3DecodeContext *s, GetBitContext* gb, int len,
> +                          int append)
> +{
> +    int buflen;
> +    int bit_offset;
> +    int pos;
> +
> +    if (!append) {
> +        s->frame_offset = get_bits_count(gb) & 7;
> +        s->num_saved_bits = s->frame_offset;
> +    }
> +
> +    buflen = (s->num_saved_bits + len + 8) >> 3;
> +
> +    if (len <= 0 || buflen > MAX_FRAMESIZE) {
> +         av_log_ask_for_sample(s->avctx, "input buffer too small\n");
> +         s->packet_loss = 1;
> +         return;
> +    }
> +
> +    if (!append) {
> +        s->num_saved_bits += len;
> +        memcpy(s->frame_data, gb->buffer + (get_bits_count(gb) >> 3),
> +              (s->num_saved_bits  + 8)>> 3);
> +        skip_bits_long(gb, len);
> +    } else {
> +        bit_offset = s->num_saved_bits & 7;
> +        pos = (s->num_saved_bits - bit_offset) >> 3;
> +
> +        s->num_saved_bits += len;
> +
> +        /** byte align prev_frame buffer */
> +        if (bit_offset) {
> +            int missing = 8 - bit_offset;
> +            missing = FFMIN(len, missing);
> +            s->frame_data[pos++] |=
> +                get_bits(gb, missing) << (8 - bit_offset - missing);
> +            len -= missing;
> +        }
> +
> +        /** copy full bytes */
> +        while (len > 7) {
> +            s->frame_data[pos++] = get_bits(gb, 8);
> +            len -= 8;
> +        }
> +
> +        /** copy remaining bits */
> +        if (len > 0)
> +            s->frame_data[pos++] = get_bits(gb, len) << (8 - len);
> +
> +    }
> +
> +    init_get_bits(&s->gb, s->frame_data, s->num_saved_bits);
> +    skip_bits(&s->gb, s->frame_offset);

ff_copy_bits()
also you could keep a PutBitContext instead of dealing with the remaining %8
bits,
and maybe some frames can be decoded without copying them
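
To show why a persistent writer removes the manual alignment handling, here is a toy bit writer. It is not FFmpeg's PutBitContext and does not use its API; BitWriter, bw_init() and bw_put() are hypothetical names, and the bit-by-bit loop is chosen for clarity, not speed.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical minimal bit writer, MSB first. */
typedef struct BitWriter {
    uint8_t *buf;
    int      size;    /* buffer size in bytes */
    int      bit_pos; /* number of bits written so far */
} BitWriter;

static void bw_init(BitWriter *bw, uint8_t *buf, int size)
{
    memset(buf, 0, size);
    bw->buf     = buf;
    bw->size    = size;
    bw->bit_pos = 0;
}

/* Append the low n bits (0..32) of value; returns -1 if they do not fit. */
static int bw_put(BitWriter *bw, int n, uint32_t value)
{
    int i;

    if (n < 0 || n > 32 || bw->bit_pos + n > bw->size * 8)
        return -1;
    for (i = n - 1; i >= 0; i--) {
        if ((value >> i) & 1)
            bw->buf[bw->bit_pos >> 3] |= 0x80 >> (bw->bit_pos & 7);
        bw->bit_pos++;
    }
    return 0;
}
```

Because the write position survives between calls, appending a partial frame is just a sequence of bw_put() calls; the caller never has to special-case a byte that is only partly filled.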

[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Complexity theory is the science of finding the exact solution to an
approximation. Benchmarking OTOH is finding an approximation of the exact