[FFmpeg-devel] [PATCH 001/244] Add a new channel layout API

Anton Khirnov anton at khirnov.net
Tue Jan 7 20:34:20 EET 2020


Quoting Nicolas George (2019-12-31 16:17:49)
> Anton Khirnov (12019-12-29):
> > In the API namespace (function names) or the parameter names? For the
> > latter, it can be changed at any time without problem and I don't really
> > care much. For the former, the header is called channel_layout and I'd
> > lean towards keeping that aligned with the namespace. Not a very strong
> > opinion though.
> 
> I meant both, but of course the names of the public symbols (and the name
> of the header, which is not set in marble) are what we are stuck with
> and requires careful planning. I do not insist on it, but it is
> something to consider.
> 
> In fact, I just noticed that you used chlayout at some places in the
> public API instead of channel_layout.

Yes, fields in structs where channel_layout already exists. I don't see
what else can be done there.

> 
> > Hmm, this was apparently added by Vittorio so I'm not sure, but would
> > assume it refers to AV_CHAN_SILENCE.
> 
> It will need to be clarified before it's done.

Right, explanation added.

> 
> > I am aware that it is different, but making the struct dynamic would
> > make using it significantly more cumbersome. Given how long we have
> > lived with the bitmask, I do not expect there to be a significant need
> > to add more fields to it.
> 
> The bitmask was constraining, but it was very simple: we can accept to
> be constrained as the cost of simplicity. This new API is not very
> simple: if it is constraining, the it is not worth it.

This API is the simplest way I could think of that achieves the desired
goals (bundling the channel count+layout together, allowing arbitrary
channel counts, ambisonic,...). Most things doable with the current API
are just as simple in the new one.

> 
> > Plus, it can still be extended by adding a new AV_CHANNEL_ORDER types
> > and a corresponding new union member.
> 
> Not all extensions can be done like that.
> 
> > Given that a copy of this struct is embedded in every single frame, I'd
> > rather not add extensive dynamically allocated metadata to it. Those
> > should rather appear in some higher level API.
> 
> What higher level API? Maybe we need to reconsider embedding the
> structure in every single frame, and possibly using reference counting
> instead.

Reference counting means dynamic allocation and a whole bunch of extra
complixity. I believe it is not worth it.

> 
> > They also can (and are, in the patchset) used as
> > (AVChannelLayout)AV_CHANNEL_LAYOUT_FOO
> > 
> > We might want to add a macro for that.
> 
> Ok. Or just mention it in the doc.
> 
> > That still requires error checks and adds considerable complexity.
> 
> This is not true, please have a look at the API I have proposed, since I
> have posted it on the mailing-list:
> 
> https://ffmpeg.org/pipermail/ffmpeg-devel/2019-December/254901.html
> 
> As you can see, all the complexity is inside the implementation. For the
> code that uses the API, it actually makes things simpler.
> 
> In particular, with your version, error checking must be done twice:
> once in av_channel_layout_describe() and once when calling it. With
> AVWriter, the first one is unnecessary.

And the caller gets an extra opportunity to receive an invalid truncated
result if he doesn't go out of his way to check the result.

> 
> > There is no hiding from the fact that the string size is not bounded, so
> > either you make up an arbitrary limit and hope it's enough (and have it
> > fail for special cases), or you do dynamic allocation with error checks.
> 
> All that is taken care of by the API and hidden from the user.
> 
> > Yes. The point of the API is to provide a logical mapping between
> > audio channels and their indices in a stream. That way, the callers that
> > only read the channel layout do not (typically) need to know how exactly
> > it is stored.
> > 
> > This function is the canonical way of answering the question 'what
> > semantics does n-th channel in this stream have?', so the callers don't
> > need to handle ORDER_NATIVE vs ORDER_CUSTOM, or do any messing with
> > bitmasks.
> 
> Ok. But my point is: is the question "what semantics does n-th channel
> in this stream have?" really relevant?
> 
> (Similar with strings: the question "what is the n-th character" is
> hardly ever relevant, but people keep asking it and implementing APIs to
> answer it.)

What questions about a channel layout would you consider relevant then?

A channel layout IS - by definition - a mapping from stream indices to
semantics. 'What semantics does the n-th channel have' is one of just
two fundamental questions you can ask about a layout (the other one is
'how many channels are there'), everything else is fluff.

> 
> > I do not agree. Duplicated channels in a layout are expected to be a
> > fringe thing and how you handle them highly depends on the specific use
> > case. I expect a typical caller will want to disregard that possibility
> > and just take the first channel of each semantics.
> > So I do not believe a dedicated function for this makes sense. We could
> > always add something later though, if it turns out to be necessary.
> 
> I think you are making a mistake. I think that as soon as it will be
> technically possible, we will see cases with duplicated channels. And I
> know that some filters will do exactly that as soon as they are ported
> to this new API.
> 
> Therefore, I insist, we need a clean user interface to handle duplicated
> channels.

Why would such filters exist and why would we accept them? I do not see
how can there be a clean user interface for a broken and undefined use
case.

> 
> > Again, yes. E.g. you are decoding an audio stream with multiple channels
> > and want to know which is the 'front left' one. This function gives you
> > the canonical answer to that.
> 
> Same as above: I do not insist, but I doubt that "which one is the front
> left" is a very interesting question.

See above.

-- 
Anton Khirnov


More information about the ffmpeg-devel mailing list