[FFmpeg-devel] [PATCH 4/9] avutil: introduce an Immersive Audio Model and Formats API
James Almer
jamrial at gmail.com
Thu Nov 30 15:01:16 EET 2023
On 11/30/2023 8:01 AM, Anton Khirnov wrote:
> Quoting James Almer (2023-11-26 02:28:53)
>> diff --git a/libavutil/iamf.h b/libavutil/iamf.h
>> new file mode 100644
>> index 0000000000..1f4919efdb
>> --- /dev/null
>> +++ b/libavutil/iamf.h
>> +enum AVIAMFAudioElementType {
>> + AV_IAMF_AUDIO_ELEMENT_TYPE_CHANNEL,
>> + AV_IAMF_AUDIO_ELEMENT_TYPE_SCENE,
>
> 'audio' in the names is redundant and makes already long identifiers
> unnecessarily longer
I'm trying to keep everything namespaced. Audio Elements are not the
only part of the spec to use "element". See Submixes.
>
>> +};
>> +
>> +/**
>> + * @defgroup lavf_iamf_params Parameter Definition
>> + * @{
>> + * Parameters as defined in section 3.6.1 and 3.8
>
> of what?
Should i link https://aomediacodec.github.io/iamf/ somewhere?
>
>> +/**
>> + * Mix Gain Parameter Data as defined in section 3.8.1
>> + *
>> + * Subblocks in AVIAMFParamDefinition use this struct when the value or
>> + * @ref AVIAMFParamDefinition.param_definition_type param_definition_type is
>> + * AV_IAMF_PARAMETER_DEFINITION_MIX_GAIN.
>> + */
>> +typedef struct AVIAMFMixGainParameterData {
>
> Does 'ParameterData' at the end really serve any purpose?
Follow the names as in
https://aomediacodec.github.io/iamf/#obu-parameterblock
I can change it to Parameters or Params, or just remove it.
>
>> + const AVClass *av_class;
>> +
>> + // AVOption enabled fields
>> + unsigned int subblock_duration;
>> + enum AVIAMFAnimationType animation_type;
>> + AVRational start_point_value;
>> + AVRational end_point_value;
>> + AVRational control_point_value;
>> + unsigned int control_point_relative_time;
>
> All these should really be documented. Also, some vertical alignment
> would improve readability.
>
>> +/**
>> + * Parameters as defined in section 3.6.1
>
> This really REALLY needs more documentation.
Yes, was keeping better documentation for last.
>
>> + */
>> +typedef struct AVIAMFParamDefinition {
>> + const AVClass *av_class;
>> +
>> + size_t subblocks_offset;
>> + size_t subblock_size;
>> +
>> + enum AVIAMFParamDefinitionType param_definition_type;
>> + unsigned int num_subblocks;
>
> We use nb_foo generally.
For these public fields i'm keeping the same name as they are in the
spec. I use nb_foo for arrays of structs in the demuxer/muxer patches.
But i can change it if you prefer.
>
>> +AVIAMFParamDefinition *av_iamf_param_definition_alloc(enum AVIAMFParamDefinitionType param_definition_type,
>> + AVDictionary **options,
>> + unsigned int num_subblocks, AVDictionary **subblock_options,
>
> What are the dicts for?
Setting AVOptions for the AVIAMFParamDefinition and each subblock,
respectively.
>
>> + *
>> + * When audio_element_type is AV_IAMF_AUDIO_ELEMENT_TYPE_CHANNEL, this
>> + * corresponds to an Scalable Channel Layout layer as defined in section 3.6.2.
>> + * For AV_IAMF_AUDIO_ELEMENT_TYPE_SCENE, it is an Ambisonics channel
>> + * layout as defined in section 3.6.3
>> + */
>> +typedef struct AVIAMFLayer {
>> + const AVClass *av_class;
>> +
>> + // AVOption enabled fields
>> + AVChannelLayout ch_layout;
>> +
>> + unsigned int recon_gain_is_present;
>
> Every time you dedicate 4 bytes to storing one bit, God kills a kitten.
I'll shave a few bytes.
>
>> + /**
>> + * Output gain flags as defined in section 3.6.2
>
> It would be really really nice if people could understand the struct
> contents without some external document.
>
>> + * This field is defined only if audio_element_type is
>
> presumably the parent's audio_element_type
Yes, forgot the @ref. Good catch.
>
>> + * AV_IAMF_AUDIO_ELEMENT_TYPE_CHANNEL, must be 0 otherwise.
>> + */
>> + unsigned int output_gain_flags;
>> + /**
>> + * Output gain as defined in section 3.6.2
>> + *
>> + * Must be 0 if @ref output_gain_flags is 0.
>> + */
>> + AVRational output_gain;
>> + /**
>> + * Ambisonics mode as defined in section 3.6.3
>> + *
>> + * This field is defined only if audio_element_type is
>> + * AV_IAMF_AUDIO_ELEMENT_TYPE_SCENE, must be 0 otherwise.
>> + *
>> + * If 0, channel_mapping is defined implicitly (Ambisonic Order)
>> + * or explicitly (Custom Order with ambi channels) in @ref ch_layout.
>> + * If 1, @ref demixing_matrix must be set.
>> + */
>> + enum AVIAMFAmbisonicsMode ambisonics_mode;
>> +
>> + // End of AVOption enabled fields
>
> What purpose does this comment serve?
It was a reminder for me of what could be set through AVOptions. I'll
remove it.
>
>> + /**
>> + * Demixing matrix as defined in section 3.6.3
>> + *
>> + * Set only if @ref ambisonics_mode == 1, must be NULL otherwise.
>> + */
>> + AVRational *demixing_matrix;
>
> Who sets this?
lavf for demuxing and the user for muxing, as usual. Will mention it.
>
>> +typedef struct AVIAMFAudioElement {
>> + const AVClass *av_class;
>> +
>> + AVIAMFLayer **layers;
>> + /**
>> + * Number of layers, or channel groups, in the Audio Element.
>> + * For audio_element_type AV_IAMF_AUDIO_ELEMENT_TYPE_SCENE, there
>> + * may be exactly 1.
>> + *
>> + * Set by av_iamf_audio_element_add_layer(), must not be
>> + * modified by any other code.
>> + */
>> + unsigned int num_layers;
>> +
>> + unsigned int codec_config_id;
>
> ???
Ah, good catch. Need to remove this and adapt the muxer.
>
>> +int av_iamf_audio_element_add_layer(AVIAMFAudioElement *audio_element, AVDictionary **options);
>
> I would much prefer to have the caller call av_opt_set* manually rather
> than sprinkle AVDictionary function arguments everywhere.
> Do note that their usage in lavc and lavf APIs is out of necessity, not
> because it's very pretty.
Alright, will try to remove it from most of these.
More information about the ffmpeg-devel
mailing list