[FFmpeg-devel] [PATCH 0/4] avdevice/dshow: implement capabilities API

Fri Jun 11 18:24:32 EEST 2021

Quoting Diederick C. Niehorster (2021-06-10 15:29:57)
> > The problem is that libavdevice is a separate library from libavformat,
> > but fundamentally depends on accessing libavformat internals.
> 
> Ah ok, so this is at first instance about cleanup/separation, not
> necessarily about adding new functionality

It is also about new functionality, since one of the main stated
advantages of libavdevice is that it can be used transparently by all
programs that use libavformat. New libavdevice-specific APIs
go against this.

> (I do see Mark's list of
> opportunities that a new API offer, copied below). I see Nicolas argue
> this entanglement of internals is not a problem in practice, and i
> suppose there is a certain amount of taste involved here.

Do note that Nicolas' position on library separation is quite unorthodox.
I am not aware of anyone else supporting it, and several people strongly
disagree with it. It also disagrees with our current practice.

> Nothing wrong with that. I guess for me personally that it is a little
> funky to have to add/change things in AVFormat when changing the
> AVDevice API, and that it may be good to for the longer term look at
> disentangling them. I will get back to that below, in response to some
> quotes of Mark's messages last January.
> 
> Mark's (non-exhaustive) list of opportunities a libavdevice API
> redesign offers (numbered by me):
> On 20/01/2021 12:41, Mark Thompson wrote:
>  > 1. Handle frames as well as packets.
>  >    1a. Including hardware frames - DRM objects from KMS/V4L2, D3D
>  >    surfaces from Windows desktop duplication (which doesn't currently
>  >    exist but should).
>  > 2. Clear core option set - currently almost everything is set by
>  > inconsistent private options; things like pixel/sample format,
>  > frame/sample rate, geometry and hardware device should be common
>  > options to all.
>  > 3. Asynchronicity - a big annoyance in current recording scenarios
>  > with the ffmpeg utility is that both audio and video capture block,
>  > and do so on the same thread which results in skipped frames.
>  > 4. Capability probing - the existing method of options which log
>  > the capabilities are not very useful for API users.
> 
> 1 and 3 i cannot speak to, but 4 is indeed what i ran into: the
> current state of most avdevices is not useful at all for an API user
> like me when it comes to capability probing (not a reason though to
> get rid of the whole API, but to wonder why it wasn't implemented.
> while nobody apparently bothered to do it before me, i think there
> will be more than just me who will actually use it). Currently I'd
> have to issue device specific options on a not-yet opened device,
> listen to the log output, parse it, etc. But the current API already
> solves this, if only it was implemented. A clear core option set would
> be nice indeed. And the AVDevice Capabilities API actually offers a
> start at that, since it lists a bunch of options that should be
> relevant to query (and set) for each device in the form of
> ff_device_capabilities (in my patchset), or av_device_capabilities
> before Andreas' patch removing it in January. I don't think its
> complete, but its a good starting point.
> 
> Mark Thompson (2021-01-25):
> > * Many of those are using it via the ffmpeg utility, but not all.
> 
> Indeed, i am an (aspiring) API user, of the dshow device specifically,
> and possibly v4l2 later (but my project is Windows-only right now).
> Currently hampered by lack of some API not being implemented for
> dshow, hence my patch set.
> 
> > * The libavdevice API is the libavformat API because it was originally
> > split out from libavformat, and it has the nice property that devices
> > and files end up being interchangable in some contexts.
> 
> I can't underline enough how nice this is. My situation is simple:
> devices such as webcams (but plenty others) may deliver video in
> various formats, including encoded. I would have to decode those to
> use them, output provided by the devices would thus have to go through
> much the same pipeline as data from video files. I already had code
> for reading in video files, so changes to also support webcams were
> absolutely minimal. However, i needed some APIs implemented to really
> round things off, make things both convenient (already the case) and
> flexible (my patch set).

I see a contradiction here. On one hand you're saying that the
usefulness of lavd comes from it having the same API as lavf. But then
you want to add a whole bunch of libavdevice-specific APIs. So any
program that wants to use them has to be specifically programmed for
libavdevice anyway. And the libavformat API is saddled with extra
frameworks that are of no use to "normal" (de)muxing.

At that point, why insist on accessing lavd through the lavf API? You
can have a lavd-specific API that can cleanly export everything that is
specific to capture devices, without the ugly hacks that are there
currently.

> 
> > * The libavdevice API, being the libavformat API for files, is not
> > particularly well-suited in other contexts, because devices may not
> > have the same properties as files.
> 
> Yeah, not every field in the AVFormatxxx structs is relevant for an
> AVDevice. And some are a bit funkily named (like url to stuff the
> device name of my webcam into). But are there specific fields one
> would wish to provide for an avdevice that are currently not
> available?

One thing mentioned by Mark that you cite above is that lavf is designed
around working with encoded data in the form of AVPackets, whereas for
some devices handled by lavd it would be better to use decoded frames
(or even hw surfaces) wrapped in an AVFrame.

> > * To implement devices as AVInputFormat/AVOutputFormat instances,
> > libavdevice currently needs access to the internals of libavformat.
> > * Many developers want to get rid of that dependency on libavformat
> > internals, because it creates a corresponding ugliness on the
> > libavformat side which has to leave those parts exposed in an
> > ABI-constrained way.
> 
> What specific internals does libavdevice depend on? Is it only the
> various function pointers in AVInputFormat and AVOutputFormat which
> are specific to devices, not all formats? Or is there more? I also
> understand that avdevices need to implement some of the other function
> pointers to be functional (e.g. read_header, read_packet and
> read_close), but that seems unavoidable if we'd want avdevices to be
> usable where avformats are (and again: that's a huge plus in my view).
> I also understand that the AVDevice API being exposed in the
> libavformat makes it harder to evolve the AVDevice API.

The function pointers, various private APIs, contents of
AVFormatInternal, etc. This is a pretty big deal, since it restricts
what we can do to libavformat internals without breaking ABI.

-- 
Anton Khirnov