[Ffmpeg-devel] [PATCH] ATRAC3 decoder

Tue Apr 10 19:55:49 CEST 2007

Hi, better late then never. Fixed all issues from the first review also.

Michael Niedermayer wrote:
> Hi
> 
> On Sun, Feb 18, 2007 at 11:34:29AM +0100, Benjamin Larsson wrote:
>> Incomplete specifications can be found here:
>>
>> http://wiki.multimedia.cx/index.php?title=RealAudio_atrc
>>
>> Samples here:
>>
>> http://samples.mplayerhq.hu/real/AC-atrc/
>>
>> and here:
>>
>> http://samples.mplayerhq.hu/A-codecs/ATRAC3/
>>
>> Currently atrac3 in oma/omg (http://samples.mplayerhq.hu/oma/) or psmf
>> (http://samples.mplayerhq.hu/PSMF/) is not supported. Transcoding
>> to/from the rm container would need a bitstream filter which is not
>> supported either (yet).
>>
>> Any clarifications or fixes are welcome.
> 
> ok, amendment to the 1st review below
> 
> 
> [...]
>> /* quadrature mirror synthesis filter */
>> static void iqmf (float *inlo, float *inhi, unsigned int nIn, float *pOut, float *delayBuf, float *temp, float *pWindow)
> 
> whats the difference between this and the qmf code in dca.c?
> of they are similar they should be merged maybe in a qmf.c ...

I am not exactly sure what the difference between the two are but one
thing is the filter coeff length. DCA has a longer filter then ATRAC3.
The dca implementations is cosine modulated. This one isn't.

> 
> 
> [...]
>> void IMLT_NoOverlap (float *pInput, float *pOutput, int odd_band)
>> {
>>     //FIXME alignment problem when using SIMD ?
>>     float   ref_out[512];
>>     int     i;
>>
>>     if (odd_band) {
>>         /**
>>         * Reverse the odd bands before IMDCT, this is an effect of the QMF transform
>>         * or it gives better compression to do it this way.
>>         * FIXME: It should be possible to handle this in ff_imdct_calc
>>         * for that to happen a modification of the prerotation step of
>>         * all SIMD code and C code is needed.
>>         * Or fix the functions before so they generate a pre reversed spectrum.
>>         */
>>         memcpy(ref_out,pInput,256*sizeof(float));
>>         for (i=0; i<256; i++)
>>             pInput[i] = ref_out[255-i];
> 
> for (i=0; i<256; i++)
>     FFSWAP(float, pInput[i], pInput[255-i]);
> 
> 
> [...]
>> static int decodeTonalComponents (GetBitContext *gb, int *numComponents, TONE_COMP *pComponent, int numBands)
>> {
>>     int   component_count, components, var48, var4C, i, var54, quant_step_index, cnt, j, var5C, var40;
>>     int   var60, sfIndx, coded_values, eax;
> 
> variable names like var4C or eax are completely unaccptable, also i once
> again like to emphasize that the code must not be a translation of a
> binary codec but rather must be a new implementation

Fixed.

> 
> 
> [...]
> 
> 
>> static int decodeGainControl (GetBitContext *gb, GAIN_BLOCK *pGb, int numBands)
>> {
>>     int   i, cf, numData, loc;
>>     int   *pLevel, *pLoc;
>>
>>     GAIN_INFO   *pGain = pGb->gBlock;
>>
>>     for (i=0 ; i<=numBands; i++)
>>     {
>>         numData = get_bits(gb,3);
>>         pGain[i].num_gain_data = numData;
>>         pLevel = pGain[i].levcode;
>>         pLoc = pGain[i].loccode;
>>
>>         for (cf = 0; cf < numData; cf++)
>>         {
>>             *pLevel = get_bits(gb,4);
>>             loc = get_bits(gb,5);
>>
>>             if ((cf > 0 && loc <= *(pLoc-1)) || (loc << 3) > 248)
>>                 return ERROR;
> 
> and how is (loc << 3) > 248 supposed to be true?

:) Removed.

> 
> 
>>             *pLoc = loc;
>>             pLevel++;
>>             pLoc++;
>>         }
> 
> for (cf = 0; cf < numData; cf++){
>     pLevel[cf]= get_bits(gb,4);
>     pLoc  [cf]= get_bits(gb,5);
>     if(cf && pLoc[cf] <= pLoc[cf-1])
>         return -1;
> }
> 

Merged.

> 
> 
> [...]
>>     if (pGain2->num_gain_data == 0)
>>         gain1 = 1.0;
>>     else
>>         gain1 = gain_tab1[pGain2->levcode[0]];
> 
> what about setting levcode[0]=4 if num_gain_data == 0 then this if could be
> avoided

I see no gain in messing with that.

> 
> 
>>     if (pGain1->num_gain_data == 0) {
>>         for (cnt = 0; cnt < 256; cnt++)
>>             pOut[cnt] = (pIn[cnt] * gain1) + (pPrev[cnt]);
> 
> superflous ()

Removed.

> 
> 
> [...]
>>     /* Delay for the overlapping part. */
>>     memcpy(pPrev, &pIn[256], 256*sizeof(float));
> 
> cant this be replaced by swaping 2 pointers?
> 

Well I'm sure it can be but then the decoder would need
7*512*sizeof(float) more memory (almost true). This is because the all 4
qmf bands are passed through the mdct. If it was only one band then I
think it would be justified but when it is 4 the complexity is too high.
IMHO

> 
> [...]
> 
>>
>> #define INTERPOLATE(old,new,nsample) (((1.0-((float)(nsample)*0.125))*(old)) + (((float)(nsample)*0.125)*(new)))
> 
> #define INTERPOLATE(old,new,nsample) ((old) + (nsample)*0.125*((new)-(old)))
> 

Fixed.

> 
>> static int applyChannelMatrix (float *su1, float *su2, int *pPrevCode, int *pCurrCode)
>> {
>>     int    band, nsample, s1, s2;
>>     float    c1, c2;
>>     float    mc1_l, mc1_r, mc2_l, mc2_r;
>>
>>     for (band = 0; band < 4; band++) {
>>         s1 = pPrevCode[band];
>>         s2 = pCurrCode[band];
>>         nsample = 0;
>>
>>         if (s1 != s2) {
>>             /* Selector value changed, interpolation needed. */
>>             mc1_l = matrixCoeffs[s1*2];
>>             mc1_r = matrixCoeffs[s1*2+1];
>>             mc2_l = matrixCoeffs[s2*2];
>>             mc2_r = matrixCoeffs[s2*2+1];
>>
>>             /* Interpolation is done over the first eight samples. */
>>             for(; nsample < 8; nsample++) {
>>                 c1 = su1[band*256+nsample];
>>                 c2 = su2[band*256+nsample];
>>                 c2 = (c1 * INTERPOLATE(mc1_l,mc2_l,nsample)) + (c2 * INTERPOLATE(mc1_r,mc2_r,nsample));
>>                 su1[band*256+nsample] = c2;
>>                 su2[band*256+nsample] = c1 * 2.0 - c2;
>>             }
>>         }
>>
>>         /* Apply the matrix without interpolation. */
>>         switch (s2) {
>>             case 0:     /* M/S decoding */
>>                 for (; nsample < 256; nsample++) {
>>                     c1 = su1[band*256+nsample];
>>                     c2 = su2[band*256+nsample];
>>                     su1[band*256+nsample] = c2 * 2.0;
>>                     su2[band*256+nsample] = (c1 - c2) * 2.0;
>>                 }
>>                 break;
>>
>>             case 1:
>>                 for (; nsample < 256; nsample++) {
>>                     c1 = su1[band*256+nsample];
>>                     c2 = su2[band*256+nsample];
>>                     su1[band*256+nsample] = (c1 + c2) * 2.0;
>>                     su2[band*256+nsample] = c2 * -2.0;
>>                 }
>>                 break;
>>             case 2:
>>             case 3:
>>                 for (; nsample < 256; nsample++) {
>>                     c1 = su1[band*256+nsample];
>>                     c2 = su2[band*256+nsample];
>>                     su1[band*256+nsample] = c1 + c2;
>>                     su2[band*256+nsample] = c1 - c2;
>>                 }
>>                 break;
>>             default:
>>                 return ERROR;
> 
> the default cannot be reached, a assert(0) would be much more appropriate

Fixed.

> 
> 
>>         }
>>     }
>> }
>>
>> static void getChannelWeights (int indx, int flag, float *ch1, float *ch2)
>> {
>>     float    w1, w2;
>>
>>     if (indx == 7) {
>>         *ch1 = 1.0;
>>         *ch2 = 1.0;
>>     } else {
>>         w1 = (float)(indx & 7) * (1.0/7.0);
>>         w2 = sqrt(2.0 - (w1 * w1));
>>
>>         if (flag == 0) {
>>             *ch1 = w1;
>>             *ch2 = w2;
>>         }
>>         else {
>>             *ch1 = w2;
>>             *ch2 = w1;
>>         }
>>     }
> 
> ideg
> 
> getChannelWeights (int indx, int flag, float ch[2])[
>     ch[0] = indx / 7.0;
>     ch[1] = sqrt(2 - w1*w1);
>     if(flag)
>         FFSWAP(float, ch[0], ch[1]);
> }
> 

Fixed.

> 
>> }
>>
>> static void applyChannelWeighting (float *su1, float *su2, int *p3)
>> {
>>     int        band, nsample;
>>     float    w1_l, w1_r, w2_l, w2_r, w3_l, w3_r;
>>
>>     if (p3[1] == 7 && p3[3] == 7)
>>         return;
>>     else {
>>         getChannelWeights(p3[1], p3[0], &w1_l, &w1_r);
>>         getChannelWeights(p3[3], p3[2], &w2_l, &w2_r);
>>
>>         for(band = 1; band < 4; band++) {
>>             for(nsample = 0; nsample < 256; nsample++) {
>>                 if (nsample < 8) {
>>                     /* interpolation */
>>                     w3_l = INTERPOLATE(w1_l, w2_l, nsample);
>>                     w3_r = INTERPOLATE(w1_r, w2_r, nsample);
>>                 }
>>                 else {
>>                     w3_l = w2_l;
>>                     w3_r = w2_r;
>>                 }
>>
>>                 /* scale the channels by the weights */
>>                 su1[band*256+nsample] *= w3_l;
>>                 su2[band*256+nsample] *= w3_r;
>>             }
>>         }
>>     }
> 
> float w[2][2];
> 
> if (p3[1] != 7 || p3[3] != 7){
>     getChannelWeights(p3[1], p3[0], &w[0][0], &w[0][1]);
>     getChannelWeights(p3[3], p3[2], &w[1][0], &w[1][1]);
> 
>     for(ch=0; ch<2; ch++){
>         for(band = 1; band < 4; band++) {
>             /* scale the channels by the weights */
>             for(nsample = 0; nsample < 8; nsample++) {
>                 su[ch][band*256+nsample] *= INTERPOLATE(w[0][ch], w[1][ch], nsample);
>             }
>             for(; nsample < 256; nsample++) {
>                 su[ch][band*256+nsample] *= w[1][ch];
>             }
> 
>         }
>     }
> }
> 

Fixed.

> 
> [...]
>>     /* Swap the gain control buffers for the next frame. */
>>     pSnd->gcBlkSwitch = 1 - (pSnd->gcBlkSwitch);
> 
> pSnd->gcBlkSwitch ^= 1;
> 

Fixed.

> 
> [...]
>>         for (i = 0; i < (q->bytes_per_frame/2); i++, ptr1++, ptr2--) {
>>             tmp = *ptr1;
>>             *ptr1 = *ptr2;
>>             *ptr2 = tmp;
> 
> FFSWAP
> 

Fixed.

> 
> [...]
>>         q->arr4C[0] = q->arr4C[2];
>>         q->arr4C[1] = q->arr4C[3];
>>         q->arr4C[2] = q->arr4C[4];
>>         q->arr4C[3] = q->arr4C[5];
> 
> memmove()
> 

Fixed.

> 
> [...]
>>     /* Apply the iQMF synthesis filter. */
>>     p1 = q->outSamples;
>>     p2 = &(q->outSamples[256]);
>>     p3 = &(q->outSamples[512]);
>>     p4 = &(q->outSamples[768]);
>>
>>     for (i=0 ; i<q->channels ; i++) {
>>         iqmf (p1, p2, 256, p1, q->pUnits[i].delayBuf1, q->tempBuf, qmf_window);
>>         iqmf (p4, p3, 256, p3, q->pUnits[i].delayBuf2, q->tempBuf, qmf_window);
>>         iqmf (p1, p3, 512, p1, q->pUnits[i].delayBuf3, q->tempBuf, qmf_window);
>>         p1 += 1024;
>>         p2 += 1024;
>>         p3 += 1024;
>>         p4 += 1024;
>>     }
> 
> p1= q->outSamples;
> for (i=0 ; i<q->channels ; i++) {
>     p2= p1+256;
>     p3= p2+256;
>     p4= p3+256;
>     iqmf (p1, p2, 256, p1, q->pUnits[i].delayBuf1, q->tempBuf, qmf_window);
>     iqmf (p4, p3, 256, p3, q->pUnits[i].delayBuf2, q->tempBuf, qmf_window);
>     iqmf (p1, p3, 512, p1, q->pUnits[i].delayBuf3, q->tempBuf, qmf_window);
>     p1 +=1024;
> }
> 
> 
> [...]
>>     /* Check if we need to descramble and what buffer to pass on. */
>>     if (q->scrambled_stream) {
>>         decode_bytes(buf, q->decoded_bytes_buffer, avctx->block_align);
> 
> id feel better if you would add a check that block_align didnt change or just
> copy block_align into decoded_bytes_buffer_size and use that
> 
> 
> [...]
>>     /* Generate the scale factors. */
>>     for (i=0 ; i<64 ; i++)
>>         SFTable[i] = powf(2.0, (i + 2) / 3 - 5) * mantissaTab[(i + 2) % 3];
> 
> pow(2.0, (i + 15) / 3.0);
> 
> and remove the useless mantissaTab

The scalefactors are not even close to being equal if I do.

> 
> 
>>     /* Generate gain tables. */
>>     for (i=0 ; i<16 ; i++)
>>         gain_tab1[i] = powf (2.0, (float)(4 - i));
> 
> the cast looks useless

Fixed.

> 
> 
>>     for (i=-15 ; i<16 ; i++)
>>         gain_tab2[i+15] = powf (2.0, (float)i * -0.125);
> 
> the cast looks useless

Fixed.

> 
> 
> [...]
>> //reciprocals table
>> //     1/1.5       1/2.5       1/3.5      1/4.5       1/7.5       1/15.5       1/31.5
>> static const float iMaxQuant[8] = {
>>   0.0, 0.66666669, 0.40000001, 0.2857143, 0.22222222, 0.13333334, 0.064516127, 0.031746034
> 
> why not replace the inaccurate values by the ones which are commented out?
> 

Fixed.

> 
> [...]
> 

MvH
Benjamin Larsson
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: atrac3.c
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20070410/1102aa8b/attachment.asc>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: atrac3data.h
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20070410/1102aa8b/attachment.txt>