[FFmpeg-devel] [PATCH] AAC Decoder - Round 2.
Michael Niedermayer
michaelni
Sun Jun 29 16:46:48 CEST 2008
On Sun, Jun 29, 2008 at 03:19:24PM +0100, Robert Swain wrote:
> 2008/6/29 Robert Swain <robert.swain at gmail.com>:
> > 2008/6/29 Michael Niedermayer <michaelni at gmx.at>:
> >> On Sun, Jun 29, 2008 at 10:42:40AM +0100, Robert Swain wrote:
> >>> 2008/6/27 Michael Niedermayer <michaelni at gmx.at>:
> >>> > On Fri, Jun 27, 2008 at 03:35:06PM +0100, Robert Swain wrote:
> >>> >> 2008/6/23 Michael Niedermayer <michaelni at gmx.at>:
> >>> >> > On Mon, Jun 23, 2008 at 02:10:56PM +0100, Robert Swain wrote:
> >>> >> >> 2008/6/20 Michael Niedermayer <michaelni at gmx.at>:
> >>> >> >> > On Thu, Jun 19, 2008 at 04:22:57PM +0100, Robert Swain wrote:
> >>> >> >> > [...]
> >>> >> >> >> +
> >>> >> >> >> + for (g = 0; g < ics->num_window_groups; g++) {
> >>> >> >> >> + for (i = 0; i < ics->max_sfb; i++) {
> >>> >> >> >> + if (cb[g][i] == NOISE_HCB) {
> >>> >> >> >> + for (group = 0; group < ics->group_len[g]; group++) {
> >>> >> >> >> + float energy = 0;
> >>> >> >> >> + float scale = 1.;// / (float)(offsets[i+1] - offsets[i]);
> >>> >> >> >> + for (k = offsets[i]; k < offsets[i+1]; k++)
> >>> >> >> >> + energy += (float)icoef[group*128+k] * icoef[group*128+k];
> >>> >> >> >> + scale *= sf[g][i] / sqrt(energy);
> >>> >> >> >
> >>> >> >> > are you sure that the random values have to be normalized like that?
> >>> >> >> > I suspect energy is supposed tp be a constant.
> >>> >> >>
> >>> >> >> That's how it is in the spec. From section 4.6.13 Perceptual Noise
> >>> >> >> Substitution (PNS):
> >>> >> >
> >>> >> > Ive checked the spec before my reply, and i belive your code is wrong.
> >>> >> >
> >>> >> >> The energy information for percpetual noise substitution decoding is
> >>> >> >> represented by a "noise energy" value indicating the overall power of
> >>> >> >> the substituted spectral coefficients in steps of 1.5 dB. If noise
> >>> >> >> substitution coding is active for a particular group and scalefactor
> >>> >> >> band, a noise energy value is transmitted instead of the scalefactor
> >>> >> >> of the respective channel.
> >>> >> >
> >>> >> > Doesnt say that the output from the random number generator should be choped
> >>> >> > up in bands and each independantly renormalized.
> >>> >> >
> >>> >> > Heres what the spec says:
> >>> >> > /* Decode noise energies for this group */
> >>> >> > for (sfb=0; sfb<max_sfb; sfb++)
> >>> >> > if (is_noise(g,sfb))
> >>> >> > noise_nrg[g][sfb] = nrg += dpcm_noise_nrg[g][sfb];
> >>> >> > /* Do perceptual noise substitution decoding */
> >>> >> > for (b=0; b<window_group_length[g]; b++) {
> >>> >> > for (sfb=0; sfb<max_sfb; sfb++) {
> >>> >> > if (is_noise(g,sfb)) {
> >>> >> > offs = swb_offset[sfb];
> >>> >> > size = swb_offset[sfb+1] - offs;
> >>> >> > /* Generate random vector */
> >>> >> > gen_rand_vector( &spec[g][b][sfb][0], size );
> >>> >> > scale = 1/(size * sqrt(MEAN_NRG));
> >>> >> > scale *= 2.0^(0.25*noise_nrg [g][sfb]);
> >>> >> > /* Scale random vector to desired target energy */
> >>> >> > for (i=0; i<len; i++)
> >>> >> > spec[g][b][sfb][i] *= scale;
> >>> >> > }
> >>> >> > }
> >>> >> > }
> >>> >> > ...
> >>> >> > The function gen_rand_vector( addr, size ) generates a vector of length <size> with signed random values of
> >>> >> > average energy MEAN_NRG per random value. A suitable random number generator can be realized using one
> >>> >> > multiplication/accumulation per random value.
> >>> >> >
> >>> >> > No weird renormalization!
> >>> >> > also the size factor is commented out in our code, i guess to cancel the
> >>> >> > incorrect normalization mostly out.
> >>> >>
> >>> >> OK, I understand what you mean now. I see that the current code we
> >>> >> have calculates the energy of the band in question (hence there's no
> >>> >> need for /size) and scales to that energy rather than scaling to
> >>> >> 1/(size * MEAN_NRG). I have a few questions:
> >>> >>
> >>> >
> >>> >> - Is the av_random() & 0xFFFF code OK or should the 32-bit random
> >>> >> value rather be scaled to 16-bit? I suspect it should be scaled (i.e.
> >>> >> >> 16).
> >>> >
> >>> > Why dont you use the whole 32bit value? Or was the array int16_t in
> >>> > which case >>16 is as good as &0xFFFF given the random number generator
> >>> > is good.
> >>>
> >>> I might as well and I'll probably move the av_random() calls to
> >>> quant_to_spec_tool().
> >>>
> >>> >> - How does one analytically calculate the mean energy of the noise we
> >>> >> generate such that this value can be defined as a constant and used in
> >>> >> our code?
> >>> >
> >>> > How do you define energy?
> >>>
> >>> For a discrete signal x_i for i = [0, N], the energy of the signal is
> >>> sum(x_i * x_i, i=0, i=N).
> >>>
> >>> > What is the distribution of the values av_random() outputs?
> >>>
> >>> /** generates a random number on [0,0xffffffff]-interval */
> >>>
> >>> If it claims to generate a random number on that interval then without
> >>> stating any bias in the distribution, it should mean generates values
> >>> in that interval with equal probability.
> >>>
> >>> >> After assuming white noise and dabbling about a bit with
> >>> >> some maths it seems it should be 1/3 * maximum possible energy i.e.
> >>> >> #define MEAN_NRG sqrt(3.0) * 32768.0 or something like that. Is this
> >>> >> correct?
> >>> >
> >>> > If you know how you define energy then you can easily check this by
> >>> > calculating the exact nummerical energy of many values from av_random()
> >>> > your calculation should be close to the value returned ...
> >>>
> >>> I tried my sum for large values of N with x_i on the interval [0.0,
> >>> 1.0] and it came close to 1/3.
> >>>
> >>> Ideally, for the mean energy, one would use the limit as N->inf of the
> >>> above-mentioned sum / N
> >>>
> >>> > If you further assume random() has a flat equi distibuted output than
> >>> > a simple
> >>> > for(i=0; i<=0xFFFF; i++)
> >>> > find_energy(&energy, i);
> >>> > will give you the exact value
> >>> > ... the analytical one, assuming you solved the correct integral ;) is
> >>> > actually just an approximation for the continuous case ...
> >>>
> >>> Right, and I was just wondering if you could clarify the continuous case. :)
> >>>
> >>> By the way, I've just looked at the FAAD and 3GPP implementations and
> >>> they both scale per random vector, i.e. per band as the current code
> >>> does. They both actually do the scaling within the random vector
> >>> generation function though I guess FAAD is based off the 3GPP
> >>> reference code.
> >>>
> >>> So, which should we use?
> >>>
> >>> - Scaling per band as in the reference code
> >>> - Scaling with constant mean energy
> >>
> >> the second, because
> >> 1. thats what the spec we have says
> >> 2. its faster
> >> 3. its simpler
> >> 4. its random, rescaling subsets of numbers "dynamically" is not random
> >> anymore.
> >>
> >> If of course all files we had with thsis feature sounded better with the
> >> faad/3gpp variant then we should choose the first but IIRC we dont have
> >> any such files ...
> >>
> >>
> >>>
> >>> If the latter, what precision should be used for the sqrt(3.0) * 2^31 constant?
> >>
> >> as the output is float, a float constant seems logic ...
> >
> > See attached.
>
> Damnit. Why do I always spot small errors right after I've decided I'm
> happy with a patch and have submitted it? The return of av_random is
> unsigned and it needs to be signed so cast to int32_t. See attached
> again. Sorry for the noise.
ok
[...]
--
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
When you are offended at any man's fault, turn to yourself and study your
own failings. Then you will forget your anger. -- Epictetus
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: Digital signature
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20080629/c58565b2/attachment.pgp>
More information about the ffmpeg-devel
mailing list