[FFmpeg-devel] [PATCH 1/7 v2] avcodec: add an Immersive Audio Model and Formats frame split bsf

Wed Jan 31 00:07:44 EET 2024

On 1/30/2024 6:47 PM, Andreas Rheinhardt wrote:
>> +    *obu_size = get_leb(&gb);
> This stuff here should not use a GetBitContext at all, as basically
> everything is byte-aligned (and the flags above are in known bits).

I'm not going to write yet another leb() reading function to work on raw 
bytes. We have enough scattered around and in fact we should try to 
remove most.
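
For reference, the kind of byte-oriented reader being suggested would look 
roughly like the sketch below. read_leb128_bytes() is a hypothetical name, 
not something that exists in the tree, and the 8-byte cap is an assumption 
about the maximum encoded length rather than something taken from the patch.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not an existing FFmpeg function.
 * Decodes one unsigned LEB128 value from buf[0..size).
 * Returns the number of bytes consumed (1..8), or -1 if the buffer ends
 * before the terminating byte or the value needs more than 8 bytes.
 */
static int read_leb128_bytes(const uint8_t *buf, size_t size, uint64_t *value)
{
    uint64_t v = 0;

    for (size_t i = 0; i < size && i < 8; i++) {
        /* Low 7 bits carry data, least significant group first. */
        v |= (uint64_t)(buf[i] & 0x7f) << (7 * i);
        if (!(buf[i] & 0x80)) {
            /* High bit clear: this was the last byte. */
            *value = v;
            return (int)i + 1;
        }
    }
    return -1;
}
```

It's short, but it would be one more copy of the same loop to keep in sync 
with the others.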

>> +static const enum AVCodecID iamf_stream_split_codec_ids[] = {
>> +    AV_CODEC_ID_PCM_S16LE, AV_CODEC_ID_PCM_S16BE,
>> +    AV_CODEC_ID_PCM_S24LE, AV_CODEC_ID_PCM_S24BE,
>> +    AV_CODEC_ID_PCM_S32LE, AV_CODEC_ID_PCM_S32BE,
>> +    AV_CODEC_ID_OPUS,      AV_CODEC_ID_AAC,
>> +    AV_CODEC_ID_FLAC,      AV_CODEC_ID_NONE,
>> +};
>> +
>> +const FFBitStreamFilter ff_iamf_stream_split_bsf = {
>> +    .p.name         = "iamf_stream_split",
>> +    .p.codec_ids    = iamf_stream_split_codec_ids,
>> +    .p.priv_class   = &iamf_stream_split_class,
>> +    .priv_data_size = sizeof(IAMFSplitContext),
>> +    .init           = iamf_stream_split_init,
>> +    .flush          = iamf_stream_split_flush,
>> +    .close          = iamf_stream_split_close,
>> +    .filter         = iamf_stream_split_filter,
>> +};
> 
> This needs to add documentation for what this BSF is actually supposed
> to do. Right now it seems crazy: It parses the packet's data and expects
> to find OBU headers, although the input data is supposed to be PCM,
> Opus, AAC or Flac.

It's not too different from aac_adtstoasc: it takes audio from the 
codecs listed above, encapsulated in one form, and returns it in 
another.
In this case, it takes OBUs containing one or more audio frames, removes 
the OBU encapsulation, and outputs each raw audio frame in its own 
packet.
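
To illustrate the idea, here is a minimal standalone sketch of that 
splitting step. It is not the code from the patch, and the layout is 
deliberately simplified: each unit is assumed to be one header byte, a 
LEB128 payload size, and then the payload, which ignores the other header 
fields a real OBU can carry. FrameSpan, toy_leb128() and split_units() are 
made-up names for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* One raw frame located inside the input buffer (no copy is made). */
typedef struct FrameSpan {
    const uint8_t *data;
    size_t         size;
} FrameSpan;

/* Unsigned LEB128, at most 8 bytes; returns bytes consumed or -1. */
static int toy_leb128(const uint8_t *buf, size_t size, uint64_t *value)
{
    uint64_t v = 0;

    for (size_t i = 0; i < size && i < 8; i++) {
        v |= (uint64_t)(buf[i] & 0x7f) << (7 * i);
        if (!(buf[i] & 0x80)) {
            *value = v;
            return (int)i + 1;
        }
    }
    return -1;
}

/*
 * Walks a buffer of back-to-back units in the simplified layout
 * [1 header byte][LEB128 payload size][payload] and records where each
 * payload lives. Returns the number of frames found, or -1 if the input
 * is truncated or holds more than max_out units.
 */
static int split_units(const uint8_t *buf, size_t size,
                       FrameSpan *out, int max_out)
{
    int n = 0;

    while (size > 0) {
        uint64_t payload;
        int len;

        if (n == max_out)
            return -1;

        /* Skip the one-byte header (assumed layout, see above). */
        buf++;
        size--;

        len = toy_leb128(buf, size, &payload);
        if (len < 0)
            return -1;
        buf  += len;
        size -= (size_t)len;

        if (payload > size)
            return -1;

        /* Each payload would become its own output packet. */
        out[n].data = buf;
        out[n].size = (size_t)payload;
        n++;

        buf  += payload;
        size -= (size_t)payload;
    }
    return n;
}
```

In the bsf, each of those spans would be emitted as a separate packet 
instead of being stored in an array.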

I'll write some documentation.