[FFmpeg-devel] [PATCH] AAC Encoder, Round 2
Kostya
kostya.shishkov
Mon Aug 25 12:09:54 CEST 2008
On Sun, Aug 24, 2008 at 09:27:44PM +0200, Michael Niedermayer wrote:
> On Sun, Aug 24, 2008 at 09:05:54PM +0300, Kostya wrote:
> > On Sun, Aug 24, 2008 at 06:45:58PM +0200, Michael Niedermayer wrote:
[...]
> > > > 3. Encoder performs windowing and MDCT (and grouping?)
> > >
> > > i don't think grouping can be done at this point, at least not optimally.
> >
> > well, from my POV, you can just merge groups with similar scalefactors after
> > they are known
>
> well you don't know the scalefactors yet ...
> besides, what is "similar"?
I've sometimes seen two consecutive empty window groups in a frame that could
be merged; otherwise I can't say how to perform grouping.
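To make the "merge empty neighbours" case concrete, here is a minimal sketch.
It assumes grouping is stored as a list of group lengths plus a per-group
"empty" flag (all coefficients zero). The function name and the data layout
are hypothetical and not taken from the patch.

```c
#include <assert.h>

#define MAX_WINDOWS 8  /* a short-window AAC frame has 8 windows */

/* Hypothetical helper: merge neighbouring window groups that are both empty.
 * group_len[i] = number of windows in group i
 * empty[i]     = nonzero if group i carries no coefficients
 * Both arrays are compacted in place; returns the new number of groups. */
static int merge_empty_groups(int *group_len, int *empty, int num_groups)
{
    int out = 0;  /* index of the last group written so far */

    assert(num_groups >= 1 && num_groups <= MAX_WINDOWS);
    for (int i = 1; i < num_groups; i++) {
        if (empty[out] && empty[i]) {
            group_len[out] += group_len[i];  /* absorb into previous group */
        } else {
            out++;
            group_len[out] = group_len[i];
            empty[out]     = empty[i];
        }
    }
    return out + 1;
}
```

This only covers the trivial case above; it does not answer how non-empty
groups should be merged.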
> > > > 4. Model calculates perceptual entropy and thresholds
> > > > 5. Ratecontrol module in encoder uses them to produce final thresholds
> > > > 5.1 maybe it will call psy model to calculate perceptual distortion for the band
> > > > 6. Encoder quantizes input with scalefactors
> > > > 7. Encoder determines and encodes band info and coefficients
> > > > 8. Fetch next frame and goto step 1 unless it was the last frame
> > > >
> > > > Any ideas/suggestions/patches?
> > >
> > > I am not sure, this is quite vague
> > >
> > >
> > > A few points that are IMO important
> > > * decisions must NOT be bundled into psy models, that is, when we implement
> > > 3 different heuristics to choose the MDCT/window size they must be selectable
> > > independently of the remaining unrelated psy model; this also applies to
> > > things like stereo attenuation coeffs, the way low/highpass cutoff is
> > > chosen and so on ...
> >
> > then how? select a separate module for each psy step?
>
> not sure i would call it "module" but yes in principle
>
> i was more thinking of
> if(avctx->something == something){
> }else{
> }
> though, the struct, function pointer, ... system seems a little overkill here
So, should I reduce the psy model to filling in Psy3gppBand data and move
rate control and quantization to the encoder?
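A rough sketch of what independently selectable decisions could look like
with plain if/else instead of a struct of function pointers. Every name here
(EncOptions, the enums, both functions) is hypothetical, and the thresholds
are placeholders, not tuned values or anything from the patch.

```c
#include <assert.h>

/* All names below are hypothetical; none of them exist in FFmpeg. */
enum WindowHeuristic { WINDOW_ALWAYS_LONG, WINDOW_ENERGY_JUMP };
enum CutoffHeuristic { CUTOFF_FIXED, CUTOFF_FROM_BITRATE };

typedef struct EncOptions {
    enum WindowHeuristic window_heuristic;
    enum CutoffHeuristic cutoff_heuristic;
    int bit_rate;     /* bits per second */
    int sample_rate;  /* Hz */
} EncOptions;

/* Returns 1 to use eight short windows, 0 to use one long window. */
static int choose_short_windows(const EncOptions *o,
                                float prev_energy, float cur_energy)
{
    if (o->window_heuristic == WINDOW_ENERGY_JUMP) {
        /* placeholder transient test: energy grew more than 8 times */
        return cur_energy > 8.0f * prev_energy;
    } else {
        return 0;
    }
}

/* Returns the lowpass cutoff in Hz, never above the Nyquist frequency. */
static int choose_cutoff(const EncOptions *o)
{
    int nyquist, cutoff;

    assert(o->sample_rate > 0);
    nyquist = o->sample_rate / 2;
    if (o->cutoff_heuristic == CUTOFF_FROM_BITRATE) {
        cutoff = o->bit_rate / 8;  /* placeholder rule, purely illustrative */
    } else {
        cutoff = nyquist;
    }
    return cutoff < nyquist ? cutoff : nyquist;
}
```

The point is only that the window decision and the cutoff decision are
switched by separate options, so either can be changed without touching the
other or the rest of the psy model.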
> >
> > > * The primary goal is highest quality encoding, anything that would make
> > > achieving this goal harder will be rejected.
> >
> > Well, I can implement it in [...] time :)
>
> great ;)))
with the plan, of course
> >
> > > * coeff quantization and scalefactors must be decided based on RD.
> > > It's perfectly fine to support faster alternatives in addition ...
> >
> > I think that should be done in encoder.
>
> yes
> IMHO the psy model should just tell the encoder how important each band is
> in terms of audibility of distortion, that is, it should provide perceptual weights.
> That way the psy model does not need to mess with anything AAC-specific ...
> and the encoder can do all the RD, bit counting, quantization, ...
> Sadly this is not exactly how the simple 3gpp model is designed ...
that's tricky to formulate
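One possible way to formulate it, as a sketch only: the psy model hands over
one weight per band, and the encoder picks, per band, the coding option that
minimizes weight * distortion + lambda * bits. The Candidate struct, rd_pick()
and the use of a single lambda from rate control are assumptions for
illustration, not something the 3gpp model or the patch provides.

```c
#include <assert.h>

/* Hypothetical candidate: one way of coding a band (e.g. one scalefactor). */
typedef struct Candidate {
    float distortion;  /* squared quantization error of the band */
    int   bits;        /* bits needed to code the band this way */
} Candidate;

/* Pick the candidate with the lowest weight * distortion + lambda * bits.
 * weight comes from the psy model, lambda from rate control.
 * Returns the index of the best candidate. */
static int rd_pick(const Candidate *c, int n, float weight, float lambda)
{
    int best = 0;
    float best_cost;

    assert(n > 0);
    best_cost = weight * c[0].distortion + lambda * c[0].bits;
    for (int i = 1; i < n; i++) {
        float cost = weight * c[i].distortion + lambda * c[i].bits;
        if (cost < best_cost) {
            best_cost = cost;
            best      = i;
        }
    }
    return best;
}
```

With a larger weight the same band gets a finer (more expensive) candidate,
which is the only thing the psy model would need to control.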
> > As I previously mentioned, I like to keep encoder and psy model separated
> > and I like to have them working ASAP.
> >
> > As I have a working AAC encoder, I'd like to make it fit for optimal encoding
> > and then perfect it piece by piece. Rewriting it from scratch would require
> > clear requirements too. So let's settle on some workflow scheme.
>
> i didn't ask for a rewrite ...
>
>
> [...]
>
> --
> Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
>
> Those who are too smart to engage in politics are punished by being
> governed by those who are dumber. -- Plato