[FFmpeg-devel] [PATCH] AAC Decoder round 3

Tue Jul 8 02:25:52 CEST 2008

2008/7/8 Michael Niedermayer <michaelni at gmx.at>:
> On Mon, Jul 07, 2008 at 05:57:54PM +0100, Robert Swain wrote:
>> 2008/7/7 Robert Swain <robert.swain at gmail.com>:
>> > 2008/7/6 Michael Niedermayer <michaelni at gmx.at>:
>> >> On Tue, Jul 01, 2008 at 12:37:11PM +0100, Robert Swain wrote:
>> >> [...]
>> >>> +static void coupling_tool(AACContext * ac, int independent, int domain) {
>> >>
>> >> This as well as other functions could benefit from some documentation
>> >> what does it do, what is "independent" what "domain" ?
>> >
>> > independent == 0 : dependent channel coupling
>> > independent == 1 : independent channel coupling
>> >
>> > The dependence of the channel coupling has implications on the data
>> > that is shared between the coupled channels and how the data is
>> > processed.
>> >
>> > domain == 0 : apply coupling before TNS decoding
>> > domain == 1 : apply coupling after TNS decoding
>> >
>> > This variable is called cc_domain in the spec.
>> >
>> > If you're not happy with the names, please make suggestions based on
>> > the descriptions. I have added the documentation of these particular
>> > ones and I'm looking at other non-obvious function names and adding
>> > doxygen comments.
>> >
>> >>> +    int i;
>> >>> +    for (i = 0; i < MAX_TAGID; i++) {
>> >>> +        ChannelElement * cc = ac->che[ID_CCE][i];
>> >>> +        if (cc) {
>> >>> +            if (cc->coup.ind_sw && independent) {
>> >>> +                transform_coupling_tool(ac, cc, coupling_independent_trans);
>> >>> +            } else if (!cc->coup.ind_sw && !independent && (cc->coup.domain == domain)) {
>> >>> +                transform_coupling_tool(ac, cc, coupling_dependent_trans);
>> >>> +            }
>> >>> +        }
>> >>> +    }
>> >>> +}
>> >>> +
>> >>
>> >>> +static void transform_sce_tool(AACContext * ac, void (*sce_trans)(AACContext * ac, SingleChannelElement * sce)) {
>> >>
>> >> iam still not happy with _tool() functions. Functions should be named
>> >> so that the names describes what the function does.
>> >
>> > They're called tools in the spec. Looking up blah_tool shouldn't be
>> > too difficult, but I agree it would be better if one didn't have to
>> > look up the meaning of a function name. I'll see what I can do about
>> > this tomorrow.
>>
>> How far does the attached patch go towards aiding understanding of the
>> functions and non-obvious parameters?
>
> 2 steps forward 2 back.
> some things are clarified, many are more confusing with the comments than
> without any.

:/

> [...]
>
>> @@ -731,6 +731,10 @@
>>
>>  /**
>>   * Parse audio specific configuration; reference: table 1.13.
>> + *
>> + * @param   data        pointer to AVCodecContext extradata
>> + * @param   data_size   size of AVCCodecContext extradata
>
> ok
>
>
>> + * @return  Returns error status.
>
> so the even numbers are erros and odd are not? ;)

Clarified.

>>   */
>>  static int AudioSpecificConfig(AACContext * ac, void *data, int data_size) {
>>      GetBitContext gb;
>> @@ -921,6 +925,8 @@
>>
>>  /**
>>   * Decode Individual Channel Stream info; reference: table 4.6.
>> + *
>> + * @param   common_window   Channels have independent [0], or share [1], Individual Channel Stream information.
>
> mostly ok, the [0]/[1] is a little odd but its not hard to make sense off.

I thought it wouldn't be too difficult.

>>   */
>>  static int decode_ics_info(AACContext * ac, GetBitContext * gb, int common_window, IndividualChannelStream * ics) {
>>      uint8_t grouping;
>> @@ -999,6 +1005,12 @@
>>      return 0;
>>  }
>>
>> +/**
>> + * inverse quantization
>> + *
>> + * @param   a   quantized value to be dequantized
>> + * @return  Returns dequantized value.
>> + */
>>  static inline float ivquant(AACContext * ac, int a) {
>>      if (a + (unsigned int)IVQUANT_SIZE/2 - 1 < (unsigned int)IVQUANT_SIZE - 1)
>>          return ivquant_tab[a + IVQUANT_SIZE/2 - 1];
>
> ok, though the comment borders on redundant

I struggled to find this in the spec, so I did my best.

>> @@ -1008,6 +1020,10 @@
>>
>>  /**
>>   * Decode section_data payload; reference: table 4.46.
>> + *
>> + * @param   cb          array of the codebook used for a window group's scalefactor band
>> + * @param   cb_run_end  array of the last scalefactor band of a codebook run for a window group's scalefactor band
>> + * @return  Returns error status.
>>   */
>>  static int decode_section_data(AACContext * ac, GetBitContext * gb, IndividualChannelStream * ics, enum Codebook cb[][64], int cb_run_end[][64]) {
>>      int g;
>
> this one is only confusing, it begins at the use of the term "code book"
> Could you explain how these variables fit any definition of code book?

Hmm, I think cb[][] should be renamed cb_type[][]. Otherwise I see no
problem as they relate to Huffman codebooks. Any further confusion?

>> @@ -1042,6 +1058,13 @@
>>
>>  /**
>>   * Decode scale_factor_data; reference: table 4.47.
>> + *
>> + * @param   mix_gain    Channel gain (not used by AAC bitstream).
>> + * @param   global_gain first scalefactor value as scalefactors are differentially coded
>> + * @param   cb          array of the codebook used for a window group's scalefactor band
>> + * @param   cb_run_end  array of the last scalefactor band of a codebook run for a window group's scalefactor band
>> + * @param   sf          array of scalefactors or intensity stereo positions used for a window group's scalefactor band
>> + * @return  Returns error status.
>>   */
>>  static int decode_scale_factor_data(AACContext * ac, GetBitContext * gb, float mix_gain, unsigned int global_gain,
>>          IndividualChannelStream * ics, const enum Codebook cb[][64], const int cb_run_end[][64], float sf[][64]) {
>
>> @@ -1100,6 +1123,9 @@
>>      return 0;
>>  }
>>
>> +/**
>> + * Decode pulse data; reference: table 4.7.
>> + */
>>  static void decode_pulse_data(AACContext * ac, GetBitContext * gb, Pulse * pulse) {
>>      int i;
>
> redundant

It adds a reference to the spec. I don't think that's redundant. Maybe you do.

> and what does the _data postfix say?
> could you explain what a decode_pulses() would do different?

All the decode_*_data() are parsing functions I believe. They could be
renamed parse_*() but most are called *_data() in the spec so maybe
the decode_ could be dropped. What would you prefer?

decode_pulses() could be a function taking pulse data in one, encoded,
form and converting to some decoded form. If they were huffman coded
or something like that. *shrugs* I don't really mind as it's easy to
see what the function is doing anyway.

>>  static void pulse_tool(AACContext * ac, const IndividualChannelStream * ics, const Pulse * pulse, int * icoef) {
>
> the name still needs work
> add_pulses() beiing an obvious choice ...

OK.

>>      int i, off = ics->swb_offset[pulse->start];
>>      for (i = 0; i < pulse->num_pulse; i++) {
>> @@ -1246,6 +1288,14 @@
>>      }
>>  }
>>
>> +/**
>> + * Dequantize and scale spectral data; reference: 4.6.3.3.
>> + *
>> + * @param   icoef   array of quantized spectral data
>> + * @param   cb      array of the codebook used for a window group's scalefactor band
>> + * @param   sf      array of scalefactors or intensity stereo positions used for a window group's scalefactor band
>> + * @param   coef    array of dequantized, scaled spectral data
>> + */
>
>>  static void quant_to_spec_tool(AACContext * ac, const IndividualChannelStream * ics, const int * icoef,
>
> dequant() :)

OK.

>>          const enum Codebook cb[][64], const float sf[][64], float * coef) {
>>      const uint16_t * offsets = ics->swb_offset;
>> @@ -1276,6 +1326,10 @@
>>
>>  /**
>>   * Decode an individual_channel_stream payload; reference: table 4.44.
>> + *
>> + * @param   common_window   Channels have independent [0], or share [1], Individual Channel Stream information.
>
>> + * @param   scale_flag
>
> yes, what did you want to say here? :)

I'll add a comment but this variable will be unused until scalable AAC
is implemented. I was going to remove it but as it may not be
redundant in the future I wasn't so sure. Opinions?

>> + * @return  Returns error status.
>>   */
>>  static int decode_ics(AACContext * ac, GetBitContext * gb, int common_window, int scale_flag, SingleChannelElement * sce) {
>>      int icoeffs[1024];
>> @@ -1326,6 +1380,9 @@
>>      return 0;
>>  }
>>
>> +/**
>> + * Mid/Side stereo decoding; reference: 4.6.8.1.3.
>> + */
>>  static void ms_tool(AACContext * ac, ChannelElement * cpe) {
>>      const MidSideStereo * ms = &cpe->ms;
>>      const IndividualChannelStream * ics = &cpe->ch[0].ics;
>> @@ -1353,7 +1410,9 @@
>>      }
>>  }
>>
>> -
>> +/**
>> + * intensity stereo decoding; reference: 4.6.8.2.3.
>> + */
>>  static void intensity_tool(AACContext * ac, ChannelElement * cpe) {
>
> above 2 are not very informative

Again, they add references to the spec. If you don't consider that
sufficient reason for the comment then I can remove them.

>>      const IndividualChannelStream * ics = &cpe->ch[1].ics;
>>      SingleChannelElement * sce1 = &cpe->ch[1];
>> @@ -1382,6 +1441,9 @@
>>
>>  /**
>>   * Decode a channel_pair_element; reference: table 4.4.
>> + *
>> + * @param   id  identifies the instance of a syntax element
>> + * @return  Returns error status.
>>   */
>>  static int decode_cpe(AACContext * ac, GetBitContext * gb, int id) {
>>      int i;
>
> iam as smart as without the comment (i could RTFS but that defeats the puprose
> of the doxy)

Maybe you are but I thought clarification of 'id' was needed.

>> @@ -1415,6 +1477,12 @@
>>      return 0;
>>  }
>>
>> +/**
>> + * Decode coupling_channel_element; reference: table 4.8.
>> + *
>> + * @param   id  identifies the instance of a syntax element
>> + * @return  Returns error status.
>> + */
>>  static int decode_cce(AACContext * ac, GetBitContext * gb, int id) {
>>      int num_gain = 0;
>>      int c, g, sfb;
>> @@ -1479,14 +1547,25 @@
>>      return 0;
>>  }
>>
>> +/**
>> + * Parse Spectral Band Replication extension data; reference: table 4.55.
>> + *
>
>> + * @param   crc flag indicating the presence of CRC data
>
> s/data/checksum/ i assume ...

Done.

> [..]
>> @@ -1577,6 +1662,12 @@
>>      return res;
>>  }
>>
>> +/**
>> + * Decode TNS filter coefficients and apply all-pole filters; reference: 4.6.9.3.
>
> TNS = temporal noise shaping?

Yes. In this instance I was actually trying to limit the length of the line. :)

>> + *
>> + * @param   decode  1 if tool is used normally, 0 if tool is used in LTP
>
> hmm

I didn't find any useful documentation of this variable or its
semantics in the spec so, again, I did my best.

>> +/**
>> + * tns_filter_tool wrapper to make interface consistent.
>> + */
>
> yes and i hate this function :)

I could remove all these wrappers if you like. I don't really see the
point in them either.

>>  static void tns_trans(AACContext * ac, SingleChannelElement * sce) {
>>      if(sce->tns.present) tns_filter_tool(ac, 1, sce, sce->coeffs);
>>  }
>> @@ -1729,6 +1823,9 @@
>>  }
>>  #endif /* AAC_LTP */
>>
>> +/**
>> + * Conduct IMDCT and windowing.
>> + */
>>  static void window_trans(AACContext * ac, SingleChannelElement * sce) {
>
> so why isnt it called imdct_and_windowing() ?
> its not transforming a window like the name would hint at ...

OK.

>>      IndividualChannelStream * ics = &sce->ics;
>>      float * in = sce->coeffs;
>> @@ -1910,6 +2007,11 @@
>>  }
>>  #endif /* AAC_SSR */
>>
>> +/**
>> + * Apply dependent channel coupling.
>> + *
>> + * @param   index   which gain to use for coupling
>> + */
>>  static void coupling_dependent_trans(AACContext * ac, ChannelElement * cc, SingleChannelElement * sce, int index) {
>>      IndividualChannelStream * ics = &cc->ch[0].ics;
>>      const uint16_t * offsets = ics->swb_offset;
>> @@ -1938,6 +2040,11 @@
>>      }
>>  }
>>
>> +/**
>> + * Apply independent channel coupling.
>> + *
>> + * @param   index   which gain to use for coupling
>> + */
>
> Iam still a little confused about what the difference is between dependant
> and independent channel coupling (without RTFS...)

It's confusing when you read the spec too. :)

Rob