[FFmpeg-devel] Nellymoser encoder
Bartlomiej Wolowiec
bartek.wolowiec
Thu Aug 28 12:53:50 CEST 2008
On Thursday 28 August 2008 00:11:20 Michael Niedermayer wrote:
[...]
> > + DSPContext dsp;
> > + MDCTContext mdct_ctx;
> > + DECLARE_ALIGNED_16(float, mdct_out[NELLY_SAMPLES]);
> > + DECLARE_ALIGNED_16(float, buf[2 * NELLY_SAMPLES]); ///< sample buffer
> > +} NellyMoserEncodeContext;
> > +
> >
> > +static DECLARE_ALIGNED_16(float, sine_window[NELLY_SAMPLES]);
>
> duplicate of ff_sine_windows and sine_window from nellymoserdec
Not really: sine_window in nellymoserdec holds only half of the window. I
haven't compared performance, but it seems to me that vector_fmul on the full
window may be faster than overlap_and_window from nellymoserdec, which uses
the half window. Or maybe I am missing some detail...
[...]
> > +/**
> > + * Searching index in table with size table_size, where
> > + * |val-table[best_idx]| is minimal.
> > + * It assumes that table elements are in increasing order and uses binary search.
> > + */
> > +#define find_best_value(val, table, table_size, best_idx) \
> > +{ \
> > + int first=0, last=table_size-1, mid; \
> > + while(first<=last){ \
> > + mid=(first+last)/2; \
> > + if(val > table[mid]){ \
> > + first = mid + 1; \
> > + }else{ \
> > + last = mid - 1; \
> > + } \
> > + } \
> > + if(!first || (first!=table_size && table[first]-val < val-table[last])) \
> > + best_idx = first; \
> > + else \
> > + best_idx = last; \
> > +}
>
> This can be done faster with a look up table
> and a single right value vs. left value check
OK, I can do that for ff_nelly_init_table and ff_nelly_delta_table, but I don't
really know how to do it for a float table (ff_nelly_dequantization_table).
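
For the integer tables, one possible reading of the lookup-table idea is
sketched below. This is only my interpretation, not necessarily what was meant
in the review: NearestLut, LUT_SIZE and the function names are hypothetical,
and the table is assumed to be strictly increasing. The lookup gives a
starting index and the neighbour comparison finishes the job; with buckets
narrower than the smallest gap in the table the loop advances only a step or
two.

```c
#include <assert.h>
#include <stdlib.h>

#define LUT_SIZE 64   /* number of buckets, chosen arbitrarily here */

typedef struct {
    const int *table;   /* sorted, strictly increasing */
    int size;
    int min;            /* table[0] */
    int step;           /* bucket width */
    int lut[LUT_SIZE];  /* largest index whose entry is <= bucket start */
} NearestLut;

static void nearest_lut_init(NearestLut *l, const int *table, int size)
{
    int b, idx = 0;
    assert(size > 0);
    l->table = table;
    l->size  = size;
    l->min   = table[0];
    /* step * LUT_SIZE exceeds the table range, so every in-range value
     * maps to a valid bucket */
    l->step  = (table[size - 1] - table[0]) / LUT_SIZE + 1;
    for (b = 0; b < LUT_SIZE; b++) {
        int start = l->min + b * l->step;
        while (idx + 1 < size && table[idx + 1] <= start)
            idx++;
        l->lut[b] = idx;
    }
}

static int nearest_lut_find(const NearestLut *l, int val)
{
    int idx;
    if (val <= l->min)
        return 0;
    if (val >= l->table[l->size - 1])
        return l->size - 1;
    idx = l->lut[(val - l->min) / l->step];
    /* move right while the right neighbour is strictly closer */
    while (idx + 1 < l->size &&
           abs(l->table[idx + 1] - val) < abs(val - l->table[idx]))
        idx++;
    return idx;
}

/* Reference implementation, used only to check the lookup version. */
static int nearest_linear(const int *table, int size, int val)
{
    int i, best = 0;
    for (i = 1; i < size; i++)
        if (abs(table[i] - val) < abs(table[best] - val))
            best = i;
    return best;
}
```

On a tie both versions return the lower index, which matches the macro above.
For the float table the same scheme would need the value quantized to an
integer bucket first; I have not tried that.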
> > +
> > +/**
> > + * Encodes NELLY_SAMPLES samples. It assumes, that samples contains 3 * NELLY_BUF_LEN values
> > + * @param s encoder context
> > + * @param output output buffer
> > + * @param output_size size of output buffer
> > + * @param samples input samples
> > + */
> > +static void encode_block(NellyMoserEncodeContext *s,
> > + unsigned char *output, int output_size, float *samples)
> > +{
> > + PutBitContext pb;
> > + int i, band, block, best_idx, power_idx = 0;
> > + float power_val, power_candidate, coeff, coeff_sum;
> > + int band_start, band_end;
> > +
> > + apply_mdct(s, samples, s->mdct_out);
> > + apply_mdct(s, samples + NELLY_BUF_LEN, s->mdct_out + NELLY_BUF_LEN);
> > +
> > + init_put_bits(&pb, output, output_size * 8);
> > +
> > + band_start = 0;
> > + band_end = ff_nelly_band_sizes_table[0];
> > + for (band = 0; band < NELLY_BANDS; band++) {
> > + coeff_sum = 0;
> > + for (i = band_start; i < band_end; i++) {
> >
> > + for (block = 0; block < 2; block++) {
> > + coeff = s->mdct_out[i + block * NELLY_BUF_LEN];
> > + coeff_sum += coeff * coeff;
> > + }
>
> I'd unroll that by hand to
> coeff_sum += s->mdct_out[i ]*s->mdct_out[i ]
> +s->mdct_out[i + NELLY_BUF_LEN]*s->mdct_out[i + NELLY_BUF_LEN];
>
> > + }
> > + power_candidate =
> > + (log(FFMAX(64.0, coeff_sum / (ff_nelly_band_sizes_table[band] << 1))) -
> > + log(64.0)) * 1024.0 / M_LN2;
>
> log(FFMAX(1.0, coeff_sum / (ff_nelly_band_sizes_table[band] << 7))) *
> 1024.0 / M_LN2;
>
> also this is based on
> (sum(0..N) ABS(coeff)^2/N)^(1/2)
>
> it would be interesting to try
> C*(sum(0..N) ABS(coeff)^D/N)^(1/D) for different values of C and D
>
> maybe you could try
> C={0.9,1.0,1.1}
> D={1.9,2.0,2.1}
> at first and see if any improves distortion
Hmm... How should I measure distortion? I've listened to a few recordings and,
in my opinion, the differences are insignificant: sometimes D=2.0 sounds
better, sometimes D=2.3... In my opinion, C!=1.0 doesn't give better results.
> > +
> > + if (band) {
> > + power_candidate -= power_idx;
> > + find_best_value(power_candidate, ff_nelly_delta_table, 32, best_idx);
> > + put_bits(&pb, 5, best_idx);
> > + power_idx += ff_nelly_delta_table[best_idx];
> > + } else {
> > + //base exponent
> > + find_best_value(power_candidate, ff_nelly_init_table, 64, best_idx);
> > + put_bits(&pb, 6, best_idx);
> > + power_idx = ff_nelly_init_table[best_idx];
> > + }
>
> I wish i knew how to optimally assign these values, sadly i do not.
> Suggestions would be welcome of course in case anyone has an idea on how
> to optimally select them, the tricky part is that these not only scale the
> signal, they also are the basis upon which the bits per band and thus
> encoding is selected.
>
> Still they could be made to closer match the "power_candidate" values from
> above using viterbi though arguably it would just be closer to a guess.
>
> An alternative may be to just retry the whole encode_block with slightly
> changed power_candidate values for each band and pick what ends up with the
> least distortion (that is least difference to the input signal)
> This should be rather easy to try ...
Slightly changed? What exactly do you mean? And again there is the problem of
how to measure distortion: a plain difference of the MDCT coefficients won't
be a good measure.
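
If I read the retry suggestion correctly, it could look roughly like the
sketch below: nudge each band's power value by +/- delta, re-evaluate, and
keep the change only when the distortion drops. Everything here is
hypothetical (distortion_fn, refine_band_powers, and the toy evaluator that
stands in for a real encode/decode round trip), and the plain squared error is
only the simplest possible measure, with the caveat about audible quality
still open.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback: encode and decode with the given per-band power
 * values and return the resulting distortion. */
typedef double (*distortion_fn)(const int *power, int bands, void *opaque);

/* Plain sum of squared differences between reference and decoded data. */
static double squared_error(const float *ref, const float *dec, int n)
{
    double err = 0.0;
    int i;
    for (i = 0; i < n; i++) {
        double diff = (double)ref[i] - dec[i];
        err += diff * diff;
    }
    return err;
}

/* Greedy search: for each band try -delta and +delta, keep a change only
 * if it lowers the distortion. Returns the best distortion found. */
static double refine_band_powers(int *power, int bands, int delta,
                                 distortion_fn eval, void *opaque)
{
    static const int signs[2] = { -1, 1 };
    double best;
    int band, k;
    assert(bands > 0 && delta > 0 && eval != NULL);
    best = eval(power, bands, opaque);
    for (band = 0; band < bands; band++) {
        for (k = 0; k < 2; k++) {
            int saved = power[band];
            double d;
            power[band] = saved + signs[k] * delta;
            d = eval(power, bands, opaque);
            if (d < best)
                best = d;
            else
                power[band] = saved;   /* no gain, undo the change */
        }
    }
    return best;
}

/* Toy evaluator for illustration only: distance to a known target. */
static double toy_distortion(const int *power, int bands, void *opaque)
{
    const int *target = (const int *)opaque;
    double d = 0.0;
    int i;
    for (i = 0; i < bands; i++) {
        double diff = power[i] - target[i];
        d += diff * diff;
    }
    return d;
}
```

This costs two extra evaluations per band, so it is only practical if one
evaluation is cheap.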
> > +
> > + if (power_idx >= 0) {
> > + power_val = pow_table[power_idx & 0x7FF] / (1 << (power_idx >> 11));
> > + } else {
> > + power_val = -pow(2, -power_idx / 2048.0 - 3.0);
> > + }
>
> power_idx can be <0 ?
Yes. In this encoder code it can be, and possibly in the original decoder too.
--
Bartlomiej Wolowiec