[FFmpeg-devel] [PATCH 0/4] avdevice/dshow: implement capabilities API

Fri Jun 11 16:16:51 EEST 2021

Diederick C. Niehorster (12021-06-10):
> Let me respond on two levels.
> 
> Before exploring the design space of a separation of libavdevice and
> libavformat below, I think it is important to first comment on the
> current state (and whether the AVDevice Capabilities part of my patch
> series should be blocked by this discussion).
> 
> Importantly, I would suppose that any reorganization of libavdevice
> and libavformat and redesign of the libavdevice API must aim to offer
> at least the same functionality as the current API, that is, an
> avdevice should be able to be queried for what devices it offers
> (get_device_list), should for each device provide information about
> what formats it accepts/can provide
> (create_device_capabilities/free_device_capabilities) and should be
> able to be controlled through the API (control_message). Perhaps these
> take different forms, but same functionality should be offered. As
> such, having AVDevice Capabilities API implemented for one of the
> devices should help, not hamper, redesign efforts because it shows how
> this API would actually be used in practice. Fundamental changes such
> as a new avdevice API will be backwards incompatible no matter what,
> so having one more bit of important functionality
> (create_device_capabilities/free_device_capabilities) implemented
> doesn't create a larger threshold to initiating such a redesign
> effort. Instead, it forces that all the current API functionality is
> thought out as well during the redesign effort and nothing is forgotten. I
> thus argue that its a good thing to bring back the AVDevice Capabilities
> API, since it helps, not hinders the redesign effort. And lets not
> forget it offers users of the current API functionality (me at least)
> they need now, not at some indeterminate timepoint in the future.

I mostly agree with all that. A good API merges similar things. We
should use the object-oriented approach: base APIs for everything that
handles frames-or-packets, so that generic (data copy, timestamps
update, metadata manipulation, etc.) operations can be performed with a
single code path, and then specialized derived APIs for more specific
components.

Input devices are demuxers with a few extra methods; output devices are
muxers with a few extra methods. We already have the beginning of a
class/interface hierarchy:

	formats
	  |
	  +----	muxers
	  |	  |
	  |	  +----	output devices
	  |
	  +----	demuxers
	   	  |
	   	  +----	input devices
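
To make the hierarchy above concrete, here is a minimal sketch of how a base/derived relationship is commonly written in C by embedding the base struct as the first member. All names (Packet, Demuxer, InputDevice, copy_packets, the fake camera) are hypothetical and for illustration only; none of them are existing FFmpeg types, and this is not a proposal for the actual layout.

```c
/* Hypothetical types for illustration; not FFmpeg API. */
typedef struct Packet {
    long pts;
    int  size;
} Packet;

/* Base "class": anything that produces packets. */
typedef struct Demuxer {
    const char *name;
    int (*read_packet)(struct Demuxer *self, Packet *pkt);
} Demuxer;

/* Derived "class": a demuxer with a few extra, device-only methods.
 * The base is the first member, so a pointer to the base of an
 * InputDevice can be converted back to the InputDevice. */
typedef struct InputDevice {
    Demuxer base;
    int (*list_sources)(struct InputDevice *self, const char **names, int max);
    long next_pts;
} InputDevice;

/* Generic code path: works on any Demuxer, device or not. */
static int copy_packets(Demuxer *in, Packet *out, int max)
{
    int n = 0;
    while (n < max && in->read_packet(in, &out[n]) == 0)
        n++;
    return n;
}

/* A fake camera that produces empty packets with increasing timestamps. */
static int fake_cam_read(Demuxer *self, Packet *pkt)
{
    InputDevice *dev = (InputDevice *)self; /* valid: base is first member */
    pkt->pts  = dev->next_pts++;
    pkt->size = 0;
    return 0;
}

/* Device-only method: reports the single source this fake device has. */
static int fake_cam_list(InputDevice *self, const char **names, int max)
{
    (void)self;
    if (max < 1)
        return 0;
    names[0] = "fake camera 0";
    return 1;
}

static InputDevice make_fake_cam(void)
{
    InputDevice dev = { { "fake_cam", fake_cam_read }, fake_cam_list, 0 };
    return dev;
}
```

The point of the sketch is that copy_packets() never needs to know it was handed a device, while code that does know can still reach list_sources().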

Also, IIRC, we already have at least one protocol that does endpoint
discovery. On one hand, protocols are a separate API even from muxers
and demuxers. On the other hand, endpoint discovery is a very
device-like feature.

I take it as a sign that we should include protocols in the discussion.

> Ah ok, so this is at first instance about cleanup/separation, not
> necessarily about adding new functionality (I do see Mark's list of
> opportunities that a new API offer, copied below). I see Nicolas argue
> this entanglement of internals is not a problem in practice, and i

Almost true. We have a huge problem with the entanglement of the
libraries, but the libavformat-libavdevice aspect is a tiny part of it.
The problem is that we have eight separate libraries that depend on each
other and are developed simultaneously. Furthermore, people will always
use these libraries all at once; at worst they will not use a few of the
smaller ones. Therefore, this split brings no benefit, but it forces us
to worry about mutual compatibility of different versions of our
libraries.

Unfortunately, each time I tried to bring this issue to the discussion,
people objected with arguments that betrayed a limited reflection on the
big picture and a few misconceptions about how precisely linking works,
both static and dynamic.

> suppose there is a certain amount of taste involved here. Nothing
> wrong with that. I guess for me personally that it is a little funky
> to have to add/change things in AVFormat when changing the AVDevice
> API

We frequently have to change things in libavutil to implement things in
libavcodec, or in libavcodec to implement things in libavformat or
libavdevice. These are several libraries, but a single project, a single
development intent.

> 1 and 3 i cannot speak to, but 4 is indeed what i ran into: the
> current state of most avdevices is not useful at all for an API user
> like me when it comes to capability probing (not a reason though to
> get rid of the whole API, but to wonder why it wasn't implemented.
> while nobody apparently bothered to do it before me, i think there
> will be more than just me who will actually use it). Currently I'd
> have to issue device specific options on a not-yet opened device,
> listen to the log output, parse it, etc. But the current API already
> solves this, if only it was implemented. A clear core option set would
> be nice indeed. And the AVDevice Capabilities API actually offers a
> start at that, since it lists a bunch of options that should be
> relevant to query (and set) for each device in the form of
> ff_device_capabilities (in my patchset), or av_device_capabilities
> before Andreas' patch removing it in January. I don't think its
> complete, but its a good starting point.

I agree. Thought has been given to designing this API; the effort dried
up before the functional parts were implemented, but the design is
sound, and a good starting point for resuming the work.

> Indeed, i am an (aspiring) API user, of the dshow device specifically,
> and possibly v4l2 later (but my project is Windows-only right now).
> Currently hampered by lack of some API not being implemented for
> dshow, hence my patch set.

And thank you for it.

I want to add that in my mind, one of the goalposts for putting
libavdevice into shape is to allow re-implementing ffplay using it. (And
possibly a symmetrical interactive recording tool, ffrecord or
something.)

> > * The libavdevice API is the libavformat API because it was originally
> > split out from libavformat, and it has the nice property that devices
> > and files end up being interchangable in some contexts.
> I can't underline enough how nice this is. My situation is simple:

I can't emphasize enough how important this is. I want to say that
people who don't see how nice this feature is, how fundamental it is in
the design of libavdevice's API, are just too incompetent about it to
participate meaningfully in the discussion yet.

> > * The libavdevice API, being the libavformat API for files, is not
> > particularly well-suited in other contexts, because devices may not
> > have the same properties as files.
> Yeah, not every field in the AVFormatxxx structs is relevant for an
> AVDevice. And some are a bit funkily named (like url to stuff the
> device name of my webcam into). But are there specific fields one
> would wish to provide for an avdevice that are currently not
> available?

I think the problem emphasized here is not really about fields, more
about the working of the API: files are read on demand, while devices
operate continuously; that makes a big difference.
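
A small sketch of why that difference matters: a file waits for the reader, but a device keeps producing, so something has to give when the application is slow. The bounded queue below drops the oldest frame when full. The names (FrameQueue, device_push, app_read) and the queue depth are hypothetical, and dropping the oldest frame is only one possible policy, not what any particular device does.

```c
#include <stdbool.h>

#define QUEUE_CAP 4 /* hypothetical capture queue depth */

typedef struct FrameQueue {
    long frames[QUEUE_CAP]; /* frame sequence numbers */
    int  head;              /* index of the oldest queued frame */
    int  count;             /* number of frames currently queued */
    long dropped;           /* frames lost because nobody read in time */
} FrameQueue;

/* Called at the device's pace, whether or not anyone is reading. */
static void device_push(FrameQueue *q, long seq)
{
    if (q->count == QUEUE_CAP) {
        /* Queue full: discard the oldest frame to make room. */
        q->head = (q->head + 1) % QUEUE_CAP;
        q->count--;
        q->dropped++;
    }
    q->frames[(q->head + q->count) % QUEUE_CAP] = seq;
    q->count++;
}

/* Called at the application's pace. Returns false if nothing is queued. */
static bool app_read(FrameQueue *q, long *seq)
{
    if (q->count == 0)
        return false;
    *seq = q->frames[q->head];
    q->head = (q->head + 1) % QUEUE_CAP;
    q->count--;
    return true;
}
```

With a file, the equivalent of device_push() simply does not happen until the application asks for data, so there is never anything to drop.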

But really, we already have this difference with network streams,
especially those that do not have flow control, for example those in
multicast. These network streams have aspects of protocols, but also
aspects of devices.

And the answer is NOT to separate libavio from libavformat: protocols
and formats mesh with each other, see the example of RTP.

> Let me make an observation though: if we would not want to lose the
> possibility to use avdevices drop-in in the place of AVFormats, some
> kind of component that has access to internals of both seems
> unavoidable.

Indeed.

And you can add: unless somebody intends to rework the code for all our
current devices to adapt it to the new API, we would also need a
compatibility wrapper for them.

In practice, that would look like this:

	application
	 → libavformat API
	    → libavdevice compatibility wrapper
	       → libavdevice API
	          → wrapper for old-style device
	             → actual device

While the useful code is just:

	application
	 → libavformat/device API
	    → actual device

That's just an insane idea, a waste of time.

> Anyway, out of Mark's options i'd vote for a separate new AVDevice
> API, and an adapter component to expose/plug in AVDevices as formats.

I do not agree. I think we need to think this globally: shape our
existing APIs into a coherent object-oriented hierarchy of
classes/interfaces. This is not limited to formats and devices, we
should include protocols in the discussion, and probably codecs and
filters too.

And to handle the fact that devices and network streams are
asynchronous, the main API needs to be asynchronous itself.

Which brings me to my project to redesign libavformat around an event
loop with callbacks.

I have made a modest start on it, by writing the documentation
for the low-level single-thread event loop API. Then I need to write the
documentation for the high-level multi-thread scheduler API. Then I can
get to coding.
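
For readers unfamiliar with the pattern, here is a generic sketch of what a single-thread event loop with callbacks looks like. It is not the design referred to above, which is not described in this mail; every name here (EventLoop, loop_post, loop_run, Source, on_packet_ready) is hypothetical.

```c
#define MAX_EVENTS 16 /* hypothetical fixed queue size */

typedef void (*EventCallback)(void *opaque);

typedef struct Event {
    EventCallback cb;
    void *opaque;
} Event;

typedef struct EventLoop {
    Event queue[MAX_EVENTS];
    int head;  /* index of the next event to dispatch */
    int count; /* number of pending events */
} EventLoop;

/* Queue a callback for later dispatch. Returns -1 if the queue is full. */
static int loop_post(EventLoop *l, EventCallback cb, void *opaque)
{
    Event *e;
    if (l->count == MAX_EVENTS)
        return -1;
    e = &l->queue[(l->head + l->count) % MAX_EVENTS];
    e->cb = cb;
    e->opaque = opaque;
    l->count++;
    return 0;
}

/* Dispatch events until none are pending; returns how many ran. */
static int loop_run(EventLoop *l)
{
    int dispatched = 0;
    while (l->count > 0) {
        Event e = l->queue[l->head];
        l->head = (l->head + 1) % MAX_EVENTS;
        l->count--;
        e.cb(e.opaque);
        dispatched++;
    }
    return dispatched;
}

/* A source that delivers a fixed number of packets, one per event. */
typedef struct Source {
    EventLoop *loop;
    int remaining;
    int delivered;
} Source;

static void on_packet_ready(void *opaque)
{
    Source *s = opaque;
    s->delivered++;
    if (--s->remaining > 0)
        loop_post(s->loop, on_packet_ready, s); /* re-arm for the next packet */
}
```

The source never blocks waiting for the application; it is called back when there is work, which is the property that suits devices and network streams.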

>	   Without such functionality you'd need a bunch of special
> cases in your app to allow users to use devices as well.

Exactly. And that means some applications that were capable of using
some devices would lose that ability. We do not want that.

> All that said, lets not stop work on the current avdevice component
> (my patch set) while figuring out the way forward.

You are absolutely right on this last point.

Regards,

-- 
  Nicolas George