[FFmpeg-devel] [PATCH 6/6 v2] avformat/movenc: add support for Immersive Audio Model and Formats in ISOBMFF

James Almer jamrial at gmail.com
Mon Feb 5 19:25:23 EET 2024


On 2/5/2024 2:06 PM, Andreas Rheinhardt wrote:
> James Almer:
>> On 2/5/2024 1:18 PM, Andreas Rheinhardt wrote:
>>> James Almer:
>>>> On 2/5/2024 12:28 PM, Andreas Rheinhardt wrote:
>>>>> James Almer:
>>>>>> On 2/5/2024 12:12 PM, Andreas Rheinhardt wrote:
>>>>>>> James Almer:
>>>>>>>> On 2/3/2024 11:50 AM, Andreas Rheinhardt wrote:
>>>>>>>>>> diff --git a/libavformat/movenc.h b/libavformat/movenc.h
>>>>>>>>>> index 60363198c9..fee3e759e0 100644
>>>>>>>>>> --- a/libavformat/movenc.h
>>>>>>>>>> +++ b/libavformat/movenc.h
>>>>>>>>>> @@ -25,7 +25,9 @@
>>>>>>>>>>       #define AVFORMAT_MOVENC_H
>>>>>>>>>>         #include "avformat.h"
>>>>>>>>>> +#include "iamf.h"
>>>>>>>>>>       #include "movenccenc.h"
>>>>>>>>>> +#include "libavcodec/bsf.h"
>>>>>>>>>
>>>>>>>>> There is no need to include these here, as you don't need complete
>>>>>>>>> types. This has the added benefit of forcing you to actually
>>>>>>>>> include
>>>>>>>>> the
>>>>>>>>> files where you are using them (namely in movenc.c, where you
>>>>>>>>> forgot to
>>>>>>>>> include bsf.h).
>>>>>>>>
>>>>>>>> Ok, fixed locally.
>>>>>>>>
>>>>>>>> Will push the set soon.
>>>>>>>
>>>>>>> It seems you have not noticed my objection to the first version of
>>>>>>> your set.
>>>>>>>
>>>>>>> - Andreas
>>>>>>
>>>>>> Can you link to it?
>>>>>
>>>>> Sorry, it was v2:
>>>>> https://ffmpeg.org/pipermail/ffmpeg-devel/2024-February/320722.html
>>>>>
>>>>> - Andreas
>>>>
>>>> I removed the codec list from the split bsf like you asked, and
>>>> explained what the bsfs do in the documentation.
>>>
>>> For those few codecs where different framings are common and supported
>>> by us, the muxers convert the given framing to the needs of the output
>>> format; decoders also support the various framings. This of course only
>>> works if they can decide which packetization the input uses; it is
>>> possible for the cases we support.
>>>
>>> If you allow that packets can contain OBU encapsulated data for
>>> arbitrary codec ids (even if only intended for a few of them), then this
>>> packetization would become officially allowed and we would have to adapt
>>> our muxers and decoders accordingly. Which is just not possible, because
>>> there is just no information available to base this decision on.
>>
>> So you want me to state that these bsfs should not be used at all by
>> library users, and that they are meant exclusively to be inserted by the
>> mov muxer and demuxer?
>>
> 
> Actually, I don't think that this should be done in a BSF at all. For
> the reasons outlined above.

I have to disagree. I'm using the packet filtering API to filter packets.
Saying that it shouldn't be a bsf because it's mainly useful for 
internal libav* modules is not a good argument. External users can use 
it too if they want to, seeing there are lavc users other than lavf, and 
other mp4 demuxers could benefit from it. And we have bsfs, like 
vp9_superframe_split, that have virtually no use whatsoever for callers 
other than our native decoders (even though other decoders could benefit 
from it too).

These are packetizer and depacketizer filters. We're not standardizing 
any kind of encapsulation for specific codecs other modules will have 
too look for.

> 
>>>
>>> There is a second complication with iamf_frame_split_bsf: Up until now,
>>> BSFs only passed the stream index value through. But with this BSF the
>>> output may have multiple ids even when the input only had one. I am
>>> pretty sure that this will surprise and break many users. I don't know
>>> whether ffmpeg is among them (if a user inserts the BSF manually).
>>
>> This would be addressed by forbidding (or declaring unsupported) the
>> usage of the filters by external callers.
>>
> 
> So a BSF for only one caller (lavf)?
> 
>>>
>>> In fact, for this BSF the stream_index that the output packet gets is
>>> determined by the offset as well as the packet data alone. The only way
>>> for the demuxer to know these numbers is if it has already parsed the
>>> packet data before and added streams according to this. Looking at 3/6
>>
>> I think you mean 4/6.
>>
>>> it seems that this is indeed what's happening (which is wasteful). But
>>
>> Packets are not being parsed, only the descriptors in the relevant mp4
>> sample entry.
>>
> 
> And it happens twice, which is wasteful.

Muxer and bsf are separate independent modules. It's expected to be the 
case.

> 
>>> only partially: The iamf_descriptors data is checked and streams are
>>> created according to it, but the data read via av_append_packet() is not
>>> checked at all. What guarantees that it can't contain
>>> IAMF_OBU_IA_AUDIO_ELEMENT elements (which trigger adding new ids and
>>> therefore lead to an increment in the potential output stream_index)?
>>> Also notice that your 3/6 uses the pkt's stream_index coming out of the
>>> BSF without any check.
>>
>> Again 4/6. And i can add a check for stream_index.
>>
> 
> I think we should never come in a scenario where this can happen.
> 
>> I could make the split filter only parse descriptors once, and passing
>> them to it immediately after av_bfs_init(), so if packets have
>> descriptors, an error will be returned (The spec disallows descriptors
>> in samples).
>> There's also the bsf's input codecpar extradata. I could use that instead.
> 
> If one were to use a BSF for this, then sending the descriptor via
> extradata would be the way to go.
> 
> - Andreas
> 
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel at ffmpeg.org
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
> 
> To unsubscribe, visit link above, or email
> ffmpeg-devel-request at ffmpeg.org with subject "unsubscribe".


More information about the ffmpeg-devel mailing list