[FFmpeg-devel] [PATCH] MLP Encoder

Thu Aug 21 06:13:32 CEST 2008

Hello,

A lot has changed since last review. Many revision touched many
functions (like buffering data). So I did not commit any parts (there
might be a couple of OKed hunks that haven't changed though). There
are a couple of TODOs in the code that might require big changes again
(like not memcpy'ing params in mlp_encode_frame() and not copying
samples to a temporary buffer for set_filter_params()).

I have not tried IIR filters again.

And I got stuck in the lossless matrix in a point that even if it
might be really really easy, my head just blocks when it reads or
hears the word "matrix". There is some working code, but changing one
line or another, or even when I try to add something simple,
everything goes horribly wrong and I'm hopeless at debugging at the
moment (even if it might be easy). I think it might even be better to
just disable the code that's currently there since it is very dumb. It
saves ~40% in 440hz.wav, since both channels are identical. It saves
~2.5% in luckynight.mlp. But I imagine it will just bloat any sample
that doesn't share anything between both channels.

Now for Good News:
With luckynight.mlp
decoded  wav: 10668844
original mlp:  7899056  25.96%
encoded  mlp:  7931348  25.66%

This is with no IIR and the dumb decorrelation.

On Thu, Aug 14, 2008 at 2:38 PM, Michael Niedermayer <michaelni at gmx.at> wrote:
> On Thu, Aug 14, 2008 at 02:08:45AM -0300, Ramiro Polla wrote:
>> Hello,
>>
>> Attached is the MLP encoder written as part of the Google Summer of
>> Code project and mentored by Justin Ruggles.
>>
>> Things that are not quite complete:
>> - 24-bit support. I'm waiting to see what will become of
>> SAMPLE_FMT_S24, and the code might need a few changes. But mostly it
>> works;
>> - The filters... The filtering code is there and it works, but finding
>> good coefficients is not. The FIR code from flacenc.c can be plugged
>> in there quite easily, but it takes more bits to encode the
>> coefficients than the raw data. It might be because of the small frame
>> size (40 samples @ 44100kHz). If the frame size was raised to
>> something like 256 samples, the compression would be much better, but
>> the RE work shows that the bitstream doesn't allow this. Also my
>> attempts with IIR filters weren't successful.
>>
>> Things that are awfully suboptimal:
>> - The search for the best offset for the codebooks. If just one offset
>> (the average) is taken into account and no search is done, encoding is
>> 30% faster and the resulting filesize is 0.5% bigger. At one point I
>> thought taking the median (instead of the average) of all the samples
>> would compress better (more values close to 0), but it took longer to
>> calculate and would have highly inaccurate offsets, having to search
>> even more.
>>
>> Things that have not been implemented:
>> - The lossless matrix. I don't know what to do of it. So there is no
>> channel decorrelation.
>> - More than 2 channels. This shouldn't be too hard, I just didn't put
>> it high on my priorities.
>>
>> Things that can be reused from existing code:
>> - A pending patch for END_OF_STREAM in mlp.h.
>> - More structs can be reused from mlpdec.c. I would prefer to split
>> Substr into DecodingParams and RestartHeader (like the encoder does),
>> and move that to mlp.h. I'll send a patch about this depending on the
>> review.
>> - The noise code can be reused if the noise generation functions are
>> cleaned up to use shared structs or more generic parameters. This code
>> is still not being needed by the encoder though, since it does no
>> rematrixing.
>>
>> There's also another patch pending to simplify quant_step_size in mlpdec.c.
>>
>> Example of file sizes and compression using sample "luckynight.mlp":
>> decoded wav: 10668844
>> original mlp: 7899056  25.9%
>> encoded  mlp: 8563354  19.7%
>>
>> Besides this, the bitstream is very robust. You can trash it at will
>> and the decoder will play as much as possible, resuming lossless
>> playback usually in less than 70ms (for 44100kHz, with 40 samples per
>> frame and 16 frames per major sync).
>>
>> Please review.
> [...]
>
>
>> /*
>>  * MLP encoder
>>  * Copyright (c) 2008 Ramiro Polla <ramiro at lisha.ufsc.br>
>>  *
>>  * This file is part of FFmpeg.
>>  *
>>  * FFmpeg is free software; you can redistribute it and/or
>>  * modify it under the terms of the GNU Lesser General Public
>>  * License as published by the Free Software Foundation; either
>>  * version 2.1 of the License, or (at your option) any later version.
>>  *
>>  * FFmpeg is distributed in the hope that it will be useful,
>>  * but WITHOUT ANY WARRANTY; without even the implied warranty of
>>  * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
>>  * Lesser General Public License for more details.
>>  *
>>  * You should have received a copy of the GNU Lesser General Public
>>  * License along with FFmpeg; if not, write to the Free Software
>>  * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
>>  */
>>
>> #include "avcodec.h"
>> #include "bitstream.h"
>> #include "libavutil/crc.h"
>> #include "mlp.h"
>>
>> #define MAJOR_HEADER_INTERVAL 16
>
> ok (and yes please commit the stuff between ok and the previous non '>')

As I've said, I haven't committed any parts yet, since there is very
little that hasn't changed from the review.

>>
>> typedef struct {
>>     //! The index of the first channel coded in this substream.
>>     uint8_t         min_channel;
>>     //! The index of the last channel coded in this substream.
>>     uint8_t         max_channel;
>>     //! The number of channels input into the rematrix stage.
>>     uint8_t         max_matrix_channel;
>>
>>     //! The left shift applied to random noise in 0x31ea substreams.
>>     uint8_t         noise_shift;
>>     //! The current seed value for the pseudorandom noise generator(s).
>>     uint32_t        noisegen_seed;
>>
>>     //! Set if the substream contains extra info to check the size of VLC blocks.
>>     int             data_check_present;
>>
>>     //! Running XOR of all output samples.
>>     int32_t         lossless_check_data;
>> } RestartHeader;
>
> please place comments to the right of fields or leave a space before each
> comment and the previous unrelated field

Done.

>>
>> typedef struct {
>>     //! number of PCM samples in current audio block
>>     uint16_t        blocksize;
>
> i do not think this needs to be uint16

For some reason changing to unsigned int was bad for compression. I
haven't debugged why much longer.

> [...]
>> #define BITS_16         0x0
>> #define BITS_20         0x1
>> #define BITS_24         0x2
>
> enum

Done.

>> /** Returns the coded sample_rate for MLP. */
>> static int mlp_sample_rate(int sample_rate)
>> {
>>     switch (sample_rate) {
>>     case 44100 << 0: return 0x8 + 0;
>>     case 44100 << 1: return 0x8 + 1;
>>     case 44100 << 2: return 0x8 + 2;
>>     case 48000 << 0: return 0x0 + 0;
>>     case 48000 << 1: return 0x0 + 1;
>>     case 48000 << 2: return 0x0 + 2;
>>     default:
>>         return -1;
>>     }
>> }
>
> ok
>
>
>>
>> /** Writes a major sync header to the bitstream. */
>> static void write_major_sync(MLPEncodeContext *ctx, uint8_t *buf, int buf_size)
>> {
>>     PutBitContext pb;
>>
>>     init_put_bits(&pb, buf, buf_size);
>>
>>     put_bits(&pb, 24, SYNC_MAJOR       );
>>     put_bits(&pb,  8, SYNC_MLP         );
>
> ok
>
>
>>     put_bits(&pb,  4, ctx->sample_fmt  );
>>     put_bits(&pb,  4, ctx->sample_fmt  );
>
>>     put_bits(&pb,  4, ctx->sample_rate );
>
>>     put_bits(&pb,  4, ctx->sample_rate );
>
> as it is not the sample rate but some kind of sample_rate_code it should be
> named accordingly

Changed.

>>     put_bits(&pb, 11, 0                );
>
> and that is what?

Added a comment.

>>     put_bits(&pb,  5, ctx->mlp_channels);
>>
>>     /* These values seem to be constant for all MLP samples tested. */
>>     put_bits(&pb, 16, 0xb752);
>>     put_bits(&pb, 16, 0x4000);
>>     put_bits(&pb, 16, 0x0000);
>>
>>     put_bits(&pb,  1, 1); /* This value is 1 in all MLP samples tested.
>>                            * I suppose it would be 0 only when no filters
>>                            * or codebooks are used. */
>
> ok
>
>
>>     put_bits(&pb, 15, 0); /* TODO peak_bitrate: most MLP samples tested encode
>>                            * a value that evaluates peak_bitrate to 9600000 or
>>                            * a little bit less. */
>
>
>>     put_bits(&pb,  4, 1); /* TODO Support more num_substreams. */
>
> ok, we dont need substreams anytime soon IMHO
>
>
>>
>>     put_bits(&pb,  4, 0x1       ); /* TODO These values have something */
>>     put_bits(&pb, 16, 0x054c    ); /* to do with the sample rate       */
>>     put_bits(&pb,  8, ctx->mlp_channels2);
>
>>     put_bits(&pb, 32, 0x00008080); /* These values seem */
>>     put_bits(&pb,  8, 0x00      ); /* to be constants   */
>
> likely not but ok to commit
>
>
>>     put_bits(&pb,  8, ctx->mlp_channels3); /* TODO Finish understanding this field. */
>
> and i just wanted to ask what mlp_channels2 and mlp_channels3 mean :/

I think I added some more comments around the code.

>>
>>     flush_put_bits(&pb);
>>
>>     AV_WL16(buf+26, ff_mlp_checksum16(buf, 26));
>> }
>
> ok
>
>
>>
>> /** Writes a restart header to the bitstream. Damaged streams can start being
>>  *  decoded losslessly again after such a header and the subsequent decoding
>>  *  params header.
>>  */
>> static void write_restart_header(MLPEncodeContext *ctx,
>>                                  PutBitContext *pb, int substr)
>> {
>>     RestartHeader *rh = &ctx->restart_header[substr];
>>     int32_t lossless_check = xor_32_to_8(rh->lossless_check_data);
>>     unsigned int start_count = put_bits_count(pb);
>>     PutBitContext tmpb;
>>     uint8_t checksum;
>>     unsigned int ch;
>>
>>     put_bits(pb, 14, 0x31ea                ); /* TODO 0x31eb */
>>     put_bits(pb, 16, 0                     ); /* TODO I don't know what this is. Ask Ian. */
>>     put_bits(pb,  4, rh->min_channel       );
>>     put_bits(pb,  4, rh->max_channel       );
>>     put_bits(pb,  4, rh->max_matrix_channel);
>>     put_bits(pb,  4, rh->noise_shift       );
>>     put_bits(pb, 23, rh->noisegen_seed     );
>>     put_bits(pb, 19, 0                     ); /* TODO What the hell is this? */
>>     put_bits(pb,  1, rh->data_check_present);
>>     put_bits(pb,  8, lossless_check        );
>>     put_bits(pb, 16, 0                     ); /* this is zero =) */
>>
>>     for (ch = 0; ch <= rh->max_matrix_channel; ch++)
>>         put_bits(pb, 6, ch);
>>
>>     /* data must be flushed for the checksum to be right. */
>>     tmpb = *pb;
>>     flush_put_bits(&tmpb);
>>
>>     checksum = ff_mlp_restart_checksum(pb->buf, put_bits_count(pb) - start_count);
>>
>>     put_bits(pb,  8, checksum);
>> }
>
> ok
>
>
>> /** Encodes the third type of channel information for the sync headers. */
>> static uint8_t code_channels3(int channels)
>> {
>>     switch (channels) {
>>     case 1: return 0x1f;
>>     case 2: return 0x1b;
>>     case 6: return 0x00;
>>     default:
>>         return 0x1b;
>>     }
>> }
>
> i assume you dont know anything about the values?

Your assumption is correct =)

> if so its ok otherwise
> please document it
> though name the function more sanely, get_channel3_code() for example

Renamed and commented more.

>>
>> static av_cold int mlp_encode_init(AVCodecContext *avctx)
>> {
>>     MLPEncodeContext *ctx = avctx->priv_data;
>>     unsigned int quant_step_size;
>>     unsigned int substr;
>>
>>     ctx->avctx = avctx;
>>
>>     ctx->sample_rate = mlp_sample_rate(avctx->sample_rate);
>>     if (ctx->sample_rate < 0) {
>>         av_log(avctx, AV_LOG_ERROR, "Unsupported sample rate %d. Supported "
>>                             "sample rates are 44100, 88200, 176400, 48000, "
>>                             "96000, and 192000.\n", avctx->sample_rate);
>>         return -1;
>>     }
>>
>>     /* TODO support more channels. */
>>     if (avctx->channels > 2) {
>>         av_log(avctx, AV_LOG_ERROR,
>>                "Only mono and stereo are supported at the moment.\n");
>>         return -1;
>>     }
>>
>>     switch (avctx->sample_fmt) {
>>     case SAMPLE_FMT_S16: ctx->sample_fmt = BITS_16; quant_step_size = 8; break;
>>     /* TODO 20 bits: */
>>     case SAMPLE_FMT_S24: ctx->sample_fmt = BITS_24; quant_step_size = 0; break;
>>     default:
>>         av_log(avctx, AV_LOG_ERROR, "Sample format not supported. "
>>                "Only 16- and 24-bit samples are supported.\n");
>>         return -1;
>>     }
>>
>>     avctx->frame_size               = 40 << (ctx->sample_rate & 0x7);
>>     avctx->coded_frame              = avcodec_alloc_frame();
>>     avctx->coded_frame->key_frame   = 1;
>>
>>     ff_mlp_init_crc();
>>     ff_mlp_init_crc2D(NULL);
>>
>>     /* TODO mlp_channels is more complex, but for now
>>      * we only accept mono and stereo. */
>>     ctx->mlp_channels   = avctx->channels - 1;
>>     ctx->mlp_channels2  = (1 << avctx->channels) - 1;
>>     ctx->mlp_channels3  = code_channels3(avctx->channels);
>>     ctx->num_substreams = 1;
>>
>>     for (substr = 0; substr < ctx->num_substreams; substr++) {
>>         DecodingParams *dp = &ctx->decoding_params[substr];
>>         RestartHeader  *rh = &ctx->restart_header [substr];
>>         uint8_t param_presence_flags = 0;
>>         unsigned int channel;
>>
>>         rh->min_channel        = 0;
>>         rh->max_channel        = avctx->channels - 1;
>>         rh->max_matrix_channel = 1;
>>
>>         dp->blocksize          = avctx->frame_size;
>>
>>         for (channel = 0; channel <= rh->max_channel; channel++) {
>>             ChannelParams *cp = &ctx->channel_params[channel];
>>
>>             dp->quant_step_size[channel] = quant_step_size;
>>             cp->huff_lsbs                = 24;
>>         }
>>
>>         param_presence_flags |= PARAM_BLOCKSIZE;
>> /*      param_presence_flags |= PARAM_MATRIX; */
>> /*      param_presence_flags |= PARAM_OUTSHIFT; */
>>         param_presence_flags |= PARAM_QUANTSTEP;
>>         param_presence_flags |= PARAM_FIR;
>> /*      param_presence_flags |= PARAM_IIR; */
>>         param_presence_flags |= PARAM_HUFFOFFSET;
>>
>>         dp->param_presence_flags = param_presence_flags;
>>     }
>>
>>     return 0;
>> }
>
> ok
>
>
>>
>> /** Calculates the smallest number of bits it takes to encode a given signed
>>  *  value in two's complement.
>>  */
>> static int inline number_sbits(int number)
>> {
>>     int bits = 0;
>>
>>     if      (number > 0)
>>         for (bits = 31; bits && !(number & (1<<(bits-1))); bits--);
>>     else if (number < 0)
>>         for (bits = 31; bits &&  (number & (1<<(bits-1))); bits--);
>
> isnt something with av_log2(FFABS()) faster?

Changed. But mru suggested having a clz() function because his ARM can
do it in one instruction (he repeats this a couple of times every day
on IRC).

>>
>>     return bits + 1;
>> }
>
>>
>> /** Determines the smallest number of bits needed to encode the filter
>>  *  coefficients, and if it's possible to right-shift their values without
>>  *  losing any precision.
>>  */
>> static void code_filter_coeffs(MLPEncodeContext *ctx,
>>                                unsigned int channel, unsigned int filter,
>>                                int *pcoeff_shift, int *pcoeff_bits)
>> {
>>     FilterParams *fp = &ctx->channel_params[channel].filter_params[filter];
>
> hmm, why not pass FilterParams as argument?

Done.

>>     int min = INT_MAX, max = INT_MIN;
>>     int bits, shift;
>>     int or = 0;
>>     int order;
>>
>>     for (order = 0; order < fp->order; order++) {
>>         int coeff = fp->coeff[order];
>>
>
>
>>         if (coeff < min)
>>             min = coeff;
>>         if (coeff > max)
>>             max = coeff;
>>
>>         or |= coeff;
>>     }
>>
>>     bits = FFMAX(number_sbits(min), number_sbits(max));
>
> max= FFMAX(max, FFABS(coeff))
>
> that way you also do not need a signed "bits counter" ...

Hmmm... Either I'm missing something or it's not that simple.
number_sbits() is not as easy to deal with:
 -5: 4
 -4: 3
 -3: 3
 -2: 2
 -1: 1
  0: 1
  1: 2
  2: 3
  3: 3
  4: 4

> [...]
>> /** Inputs data from the samples passed by lavc into the context, shifts them
>>  *  appropriately depending on the bit-depth, and calculates the
>>  *  lossless_check_data that will be written to the restart header.
>>  */
>> static void input_data_internal(MLPEncodeContext *ctx, const uint8_t *samples,
>>                                 int32_t *lossless_check_data, int is24)
>> {
>>     const int32_t *samples_32 = (const int32_t *) samples;
>>     const int16_t *samples_16 = (const int16_t *) samples;
>>     unsigned int substr;
>>
>>     for (substr = 0; substr < ctx->num_substreams; substr++) {
>>         DecodingParams *dp = &ctx->decoding_params[substr];
>>         RestartHeader  *rh = &ctx->restart_header [substr];
>>         int32_t temp_lossless_check_data = 0;
>>         unsigned int channel;
>>         int i;
>>
>>         for (i = 0; i < dp->blocksize; i++) {
>>             for (channel = 0; channel <= rh->max_channel; channel++) {
>>                 int32_t sample;
>>
>>                 if (is24) sample = *samples_32++;
>>                 else      sample = *samples_16++;
>>
>>                 sample <<= dp->quant_step_size[channel];
>>
>>                 temp_lossless_check_data ^= (sample & 0x00ffffff) << channel;
>>                 ctx->sample_buffer[i][channel] = sample;
>>             }
>>         }
>>
>>         lossless_check_data[substr] = temp_lossless_check_data;
>>     }
>> }
>>
>> /** Wrapper function for inputting data in two different bit-depths. */
>> static void input_data(MLPEncodeContext *ctx, void *samples,
>>                        int32_t *lossless_check_data)
>> {
>>     if (ctx->avctx->sample_fmt == SAMPLE_FMT_S24)
>>         input_data_internal(ctx, samples, lossless_check_data, 1);
>>     else
>>         input_data_internal(ctx, samples, lossless_check_data, 0);
>> }
>
> ok though i would prefer if no copying where needed but it may be cleaner
> as is due to 16/32 support
>
>
>
>>
>> /** Determines the best filter parameters for the given data and writes the
>>  *  necessary information to the context.
>
>>  *  TODO Add actual filter predictors!
>
> yes

FIR added.

>>  */
>> static void set_filter_params(MLPEncodeContext *ctx,
>>                               unsigned int channel, unsigned int filter,
>>                               int restart_frame)
>> {
>>     FilterParams *fp = &ctx->channel_params[channel].filter_params[filter];
>>
>>     /* Restart frames must not depend on filter state from previous frames. */
>>     if (restart_frame) {
>>         fp->order    =  0;
>>         return;
>>     }
>>
>>     if (filter == FIR) {
>>         fp->order    =  1;
>>         fp->shift    =  0;
>>         fp->coeff[0] =  1;
>>     } else { /* IIR */
>>         fp->order    =  0;
>>         fp->shift    =  0;
>>     }
>> }
>
>
>>
>> #define INT24_MAX ((1 << 23) - 1)
>> #define INT24_MIN (~INT24_MAX)
>>
>> #define MSB_MASK(bits)  (-1u << bits)
>>
>> /* TODO What substream to use for applying filters to channel? */

Fixed =)

>> /** Applies the filter to the current samples, and saves the residual back
>>  *  into the samples buffer. If the filter is too bad and overflows the
>>  *  maximum amount of bits allowed (24), the samples buffer is left as is and
>>  *  the function returns -1.
>>  */
>
> i do not belive overflow is possible
> without checking the code too carefull i suspect you got the wrap around wrong
> somewhere
> remember (a+b) & MASK = c & MASKis the same as a & MASK = (c-b) & MASK

I got a sample from Justin that overflows at this point. I don't see
how masking could help, since the mask only clears LSBs. And I don't
see how it could wrap around (it shouldn't).

> [...]
>>
>> /** Min and max values that can be encoded with each codebook. The values for
>>  *  the third codebook take into account the fact that the sign shift for this
>>  *  codebook is outside the coded value, so it has one more bit of precision.
>>  *  It should actually be -7 -> 7, shifted down by 0.5.
>>  */
>> static int codebook_extremes[3][2] = {
>>     {-9, 8}, {-8, 7}, {-15, 14},
>> };
>
> hmm, isnt the 3rd wrong now?

No. It's still easier to check for extremes this way. There is still a
if (codebok==2)lsb_bits++; or something. The code that writes the
codes to the bitstream doesn't need it though.

>> typedef struct BestOffset {
>>     int16_t offset;
>>     int bitcount;
>>     int lsb_bits;
>> } BestOffset;
>>
>
>> /** Determines the least amount of bits needed to encode the samples using no
>>  *  codebooks.
>>  */
>> static void no_codebook_bits(MLPEncodeContext *ctx, unsigned int substr,
>>                              unsigned int channel,
>>                              int32_t min, int32_t max,
>>                              BestOffset *bo)
>> {
>>     ChannelParams  *cp = &ctx->channel_params[channel];
>>     DecodingParams *dp = &ctx->decoding_params[substr];
>>     int16_t offset;
>>     int32_t unsign;
>>     uint32_t diff;
>>     int lsb_bits;
>>
>>     /* Set offset inside huffoffset's boundaries by adjusting extremes
>>      * so that more bits are used thus shifting the offset. */
>
>>     if (min < HUFF_OFFSET_MIN)
>>         max = FFMAX(max, HUFF_OFFSET_MIN + HUFF_OFFSET_MIN - min + 1);
>
> 2*HUFF_OFFSET_MIN
>
>
>>     if (max > HUFF_OFFSET_MAX)
>>         min = FFMIN(min, HUFF_OFFSET_MAX + HUFF_OFFSET_MAX - max - 1);
>
> ditto

Done.

> [...]
>> /** Writes the residuals to the bitstream. That is, the vlc codes from the
>>  *  codebooks (if any is used), and then the residual.
>>  */
>> static void write_block_data(MLPEncodeContext *ctx, PutBitContext *pb,
>>                              unsigned int substr)
>> {
>>     DecodingParams *dp = &ctx->decoding_params[substr];
>>     RestartHeader  *rh = &ctx->restart_header [substr];
>>     int32_t sign_huff_offset[MAX_CHANNELS];
>>     int codebook            [MAX_CHANNELS];
>>     int lsb_bits            [MAX_CHANNELS];
>>     unsigned int i, ch;
>>
>>     for (ch = rh->min_channel; ch <= rh->max_channel; ch++) {
>>         ChannelParams *cp = &ctx->channel_params[ch];
>>         int sign_shift;
>>
>>         lsb_bits        [ch] = cp->huff_lsbs - dp->quant_step_size[ch];
>
>>         codebook        [ch] = cp->codebook  - 1;
>
> why -1 ?

Just because it's easier to access ff_mlp_huffman_tables[codebook[ch]] this way.

>>         sign_huff_offset[ch] = cp->huff_offset;
>>
>>         sign_shift = lsb_bits[ch] - 1;
>>
>>         if (codebook[ch] >= 0) {
>>             sign_huff_offset[ch] -= 7 << lsb_bits[ch];
>>             sign_shift += 2 - codebook[ch];
>>         }
>>
>>         /* Unsign if needed. */
>>         if (sign_shift >= 0)
>>             sign_huff_offset[ch] -= 1 << sign_shift;
>>     }
>>
>>     for (i = 0; i < dp->blocksize; i++) {
>>         for (ch = rh->min_channel; ch <= rh->max_channel; ch++) {
>>             int32_t sample = ctx->sample_buffer[i][ch] >> dp->quant_step_size[ch];
>>
>>             sample -= sign_huff_offset[ch];
>>
>
>>             if (codebook[ch] >= 0) {
>>                 int8_t vlc = sample >> lsb_bits[ch];
>>                 put_bits(pb, ff_mlp_huffman_tables[codebook[ch]][vlc][1],
>>                              ff_mlp_huffman_tables[codebook[ch]][vlc][0]);
>
> why int8_t and not int?

Changed.

>>             }
>>
>>             put_sbits(pb, lsb_bits[ch], sample);
>
> are you assuming here that ths masks the high bits off? you should not

put_bits() doesn't, but put_sbits() does.

> [...]
>
>>
>> /** Writes the access unit and substream headers to the bitstream. */
>> static void write_frame_headers(MLPEncodeContext *ctx, uint8_t *frame_header,
>>                                 uint8_t *substream_headers, unsigned int length,
>>                                 uint16_t substream_data_len[MAX_SUBSTREAMS])
>> {
>>     uint16_t access_unit_header = 0;
>>     uint16_t parity_nibble = 0;
>>     unsigned int substr;
>>
>>     parity_nibble  = ctx->timestamp;
>>     parity_nibble ^= length;
>>
>>     for (substr = 0; substr < ctx->num_substreams; substr++) {
>>         uint16_t substr_hdr = 0;
>>
>>         substr_hdr |= (0 << 15); /* extraword */
>>         substr_hdr |= (0 << 14); /* ??? */
>>         substr_hdr |= (1 << 13); /* checkdata */
>>         substr_hdr |= (0 << 12); /* ??? */
>>         substr_hdr |= (substream_data_len[substr] / 2) & 0x0FFF;
>>
>>         AV_WB16(substream_headers, substr_hdr);
>>
>>         parity_nibble ^= *substream_headers++;
>>         parity_nibble ^= *substream_headers++;
>>     }
>>
>>     parity_nibble ^= parity_nibble >> 8;
>>     parity_nibble ^= parity_nibble >> 4;
>>     parity_nibble &= 0xF;
>>
>>     access_unit_header |= (parity_nibble ^ 0xF) << 12;
>>     access_unit_header |= length & 0xFFF;
>>
>>     AV_WB16(frame_header  , access_unit_header);
>>     AV_WB16(frame_header+2, ctx->timestamp    );
>> }
>
> ok
>
>
> [...]
>> static int mlp_encode_frame(AVCodecContext *avctx, uint8_t *buf, int buf_size,
>>                             void *data)
>> {
>>     DecodingParams decoding_params[MAX_SUBSTREAMS];
>>     uint16_t substream_data_len[MAX_SUBSTREAMS];
>>     int32_t lossless_check_data[MAX_SUBSTREAMS];
>>     ChannelParams channel_params[MAX_CHANNELS];
>>     MLPEncodeContext *ctx = avctx->priv_data;
>>     uint8_t *buf2, *buf1, *buf0 = buf;
>>     int total_length;
>>     unsigned int substr;
>>     int restart_frame;
>>
>>     if (avctx->frame_size > MAX_BLOCKSIZE) {
>>         av_log(avctx, AV_LOG_ERROR, "Invalid frame size (%d > %d)\n",
>>                avctx->frame_size, MAX_BLOCKSIZE);
>>         return -1;
>>     }
>>
>>     memcpy(decoding_params, ctx->decoding_params, sizeof(decoding_params));
>>     memcpy(channel_params, ctx->channel_params, sizeof(channel_params));
>>
>>     if (buf_size < 4)
>>         return -1;
>>
>>     /* Frame header will be written at the end. */
>>     buf      += 4;
>>     buf_size -= 4;
>>
>
>>     restart_frame = !(avctx->frame_number & (MAJOR_HEADER_INTERVAL - 1));
>
> gop_size could be used here

The rest of the code is good to have ctx->major_header_interval set by
the user, but it's still not as easy as just using avctx->gop_size,
since it is for video only. (There's another thread about this
somewhere...)

[...]

Ramiro Polla
-------------- next part --------------
A non-text attachment was scrubbed...
Name: mlpenc.c
Type: text/x-csrc
Size: 54372 bytes
Desc: not available
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20080821/e1f5628f/attachment.c>