[FFmpeg-devel] [PATCH] avcodec/aacdec: don't force HE-AACv2 profile if no PS info is present

James Almer jamrial at gmail.com
Tue Aug 2 20:48:47 EEST 2022


On 7/22/2022 8:10 PM, James Almer wrote:
> On 7/22/2022 8:00 PM, Alex Converse wrote:
>> On Fri, Jul 22, 2022 at 8:37 AM James Almer <jamrial at gmail.com> wrote:
>>> On 7/22/2022 12:14 PM, Andreas Rheinhardt wrote:
>>>> James Almer:
>>>>> On 7/22/2022 11:56 AM, Andreas Rheinhardt wrote:
>>>>>> James Almer:
>>>>>>> On 7/22/2022 11:23 AM, Andreas Rheinhardt wrote:
>>>>>>>> James Almer:
>>>>>>>>> On 7/18/2022 10:57 AM, Andreas Rheinhardt wrote:
>>>>>>>>>> James Almer:
>>>>>>>>>>> On 7/14/2022 9:10 AM, Andreas Rheinhardt wrote:
>>>>>>>>>>>> James Almer:
>>>>>>>>>>>>> Should fix ticket #3361
>>>>>>>>>>>>>
>>>>>>>>>>>>> Signed-off-by: James Almer <jamrial at gmail.com>
>>>>>>>>>>>>> ---
>>>>>>>>>>>>> This also needs an update to some fate ref samples i'll upload
>>>>>>>>>>>>> before
>>>>>>>>>>>>> pushing
>>>>>>>>>>>>> (fate-aac-al_sbr_ps_04_ur and fate-aac-al_sbr_ps_06_ur which
>>>>>>>>>>>>> are now
>>>>>>>>>>>>> decoded
>>>>>>>>>>>>> properly as he_aac mono, so the .s16 files need to be 
>>>>>>>>>>>>> replaced).
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>
>> [snip]
>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> it seems at least for these samples the fixed decoder does not
>>>>>>>>> generate
>>>>>>>>> a decoded stream comparable to the float one, so I'll just 
>>>>>>>>> upload a
>>>>>>>>> new
>>>>>>>>> raw pcm file.
>>>>>>>>
>>>>>>>> When I decode both of these streams with git master, the left
>>>>>>>> channel is
>>>>>>>> pretty much identical, yet the right channel of the fixed-point 
>>>>>>>> decoder
>>>>>>>> is silent and the right channel of the floating point decoder is 
>>>>>>>> not.
>>>>>>>> With this patch applied, the result are two mono streams that are
>>>>>>>> pretty
>>>>>>>> much identical: The test sample created by the floating-point 
>>>>>>>> decoder
>>>>>>>> works with the fixed-point decoder test (if one uncomments and 
>>>>>>>> modifies
>>>>>>>> the latter). So the issue with aac-al_sbr_ps_06_ur is not a 
>>>>>>>> reason to
>>>>>>>> upload new samples.
>>>>>>>
>>>>>>> Ok, can you suggest how to add a test that decodes with the fixed 
>>>>>>> point
>>>>>>> decoder then compares that with the output of the float decoder? Is
>>>>>>> there a helper in fate.sh already for this?
>>>>>>>
>>>>>>
>>>>>> There is currently no helper in fate-run.sh for this.
>>>>>
>>>>> Yeah, figures that's the case. Can you suggest how one would work 
>>>>> for this?
>>>>>
>>>>
>>>> A new function in fate-run.sh seems to be necessary. Given that a
>>>> similar situation exists for AC-3 we should not hardcode aac; 
>>>> instead it
>>>> should have two parameters for the float and the fixed decoder. Then it
>>>> should decode the input file twice and do the same comparison that the
>>>> current tests use (they use the oneoff method, which results in
>>>> do_tiny_psnr with MAXDIFF being called).
>>>> I think the tests for the fixed-point decoder (with checksums) 
>>>> should be
>>>> separate, so that one can test this without the floating-point decoder
>>>> being present.
>>>>
>>>>>>
>>>>>>>>
>>>>>>>> - Andreas
>>>>>>>>
>>>>>>>> PS: libfdk-aac produces a file that looks pretty much like the 
>>>>>>>> floating
>>>>>>>> point decoder from git master. Are you sure your patch is correct?
>>>>>>>
>>>>>>> Yes, they duplicate the single channel in the stream and output 
>>>>>>> it as
>>>>>>> stereo, something that should be done by a filter if that's what the
>>>>>>> user wants. Decoding a mono sample should generate a mono stream.
>>>>>>
>>>>>> Not really. The channels are different.
>>>>>
>>>>> ./ffmpeg -c:a libfdk_aac -i ../samples/aac/al_sbr_ps_04_new.mp4 -af
>>>>> channelmap=channel_layout=mono:map=0 -f md5 -
>>>>>
>>>>> has the same result as
>>>>>
>>>>> ./ffmpeg -c:a libfdk_aac -i ../samples/aac/al_sbr_ps_04_new.mp4 -af
>>>>> channelmap=channel_layout=mono:map=1 -f md5 -
>>>>>
>>>>> Same with the samples in the ticket.
>>>>>
>>>>
>>>> This seems to be true for al_sbr_ps_04_new.mp4; but it is not true for
>>>> al_sbr_ps_06_new.mp4.
>>>
>>> So looks like nearly a hundred samples into al_sbr_ps_06_new.mp4 frames
>>> start containing PS info. With this patch the decoder properly decodes
>>> the first hundred as mono, but since the decoder is locked, it will keep
>>> decoding the stereo samples as mono.
>>>
>>
>> Hey all,
>>
>> I thought I should share a little bit of context about this problem,
>> but I don't mean to come back from nowhere and try to overrule you
>> all. Do what you decide is best.
>>
>> An HE-AACv2 decoder treating unsignaled mono as stereo is an
>> intentional design trade-off that the MPEG audio committee made. It is
>> a tradeoff that the FFmpeg decoder has mimicked for a number of years.
>> If you want to revisit the trade-off (and there may very well be good
>> reasons to do that) that's fine, but I think treating the current
>> behavior as a "bug" is the wrong approach.
>>
>> In fact, those fate tests are based on a Coding Technologies test
>> suite designed to validate a decoder conformed to the MPEG behavior.
>>
>> Parametric Stereo is nested inside the SBR extension after the main
>> SBR payload which itself is nested inside the AAC raw data block after
>> the main channel elements. It takes a full bitstream parse of both the
>> AAC and SBR layers and finding an SBR intra-frame to even see if PS is
>> present.
>>
>>
>> As for why I think MPEG made this trade-off (my opinion of why they 
>> did this) :
>> - It enables cutting (or joining a stream of) audio at arbitrary
>> frames without losing PS content or stereo detection.
>> - Most devices with mono output can support downmixing from stereo.
>> - Down mixing stereo to mono can be ugly with regard to phase but it
>> doesn't have nearly the complexity of the taste/environment/judgment
>> factor that surround sound to stereo downmix does.
>> - In transcode scenarios, most output codecs can support encoding
>> stereo formatted audio where both channels are identical quite
>> efficiently (even in AAC-LC with mid-side coding the overhead for
>> mono-coded as stereo is a single digit number of bits per frame).
>> - On playback on a stereo device it doesn't matter if the decoded
>> signal is one channel played out of both speakers or two channels that
>> are identical.
>> - This behavior can be overridden with more complicated signaling the
>> extradata (but this requires a transport that supports such signaling
>> and doesn't simply wrap ADTS).
>> - The folks working on iterating "MPEG-2 NBC" into the "MPEG-4 Audio"
>> monstrosity were primarily focused on getting their new features used.
> 
> Thanks a lot for clarifying this. This "bug" has been pending for nearly 
> a decade now...
> 
> So i withdraw this patch and I'll close the ticket as invalid. I'll then 
> see if adding a downmix input option is feasible, but the user could 
> just request to downmix to mono by a filter down the chain (which is 
> what the audiomatch tests do), so maybe it's not that useful.

Do you know for that matter where in the spec is this defined? I see 
fdk-aac also does the same, so it's clearly intentional, but where is it 
specified?


More information about the ffmpeg-devel mailing list