[FFmpeg-devel] [PATCH 3/5] lavf: replace FFERROR_REDO with AVERROR(EAGAIN)

Nicolas George george at nsup.org
Wed Nov 9 14:42:24 EET 2022


Anton Khirnov (12022-11-09):
> I have no idea what you mean by this. What is being broken and how?

EAGAIN and REDO are different. If you make them the same, you break one
of them. Or possibly both if you are being really stupid, but at least
one.

> Maybe more context will clarify the motivation here. What became this
> patchset started as an attempt to make all uses of EAGAIN/REDO in
> lavf/lavd consistent.

And as I explained, it is not possible.

>			While doing that I gradually came to the
> conclusion that the distinction is arbitrary - e.g. multiple demuxers
> deliberately return EAGAIN as context switch method. Now we could
> declare this practice invalid, but as this has been done in multiple
> demuxers independently, and has been in the tree for over 10 years, so
> this should not be done lightly.

I introduced REDO precisely to fix demuxers who abused EAGAIN like that.

> The question then is - what should be the effective rule for deciding
> whether a demuxer returns EAGAIN or REDO in any given case?
> Opinions on this are very much welcome.

I do not have an opinion, I have knowledge.

The semantic of EAGAIN has always been, since it stabilized, that the
operation cannot be completed immediately, it requires an external
event, like a network packet or an IRQ on a device. Therefore, the only
correct reaction to EAGAIN is to delay retrying. Ideally delay until an
event is received; if that cannot be achieved a small arbitrary delay is
still infinitely better than busy wait.

This is standard Unix system and network programming, knowing this in
and out is IMO an absolute prerequisite for tinkering with FFmpeg's
network code.

The semantic of AVERROR(EAGAIN) is ours to decide, but it would be
really really stupid from us to decide for something that is not aligned
with the wider semantic of EAGAIN. And aligning with EAGAIN is what we
did, with the caveat that most of it only works with devices and
elementary protocols. Except for the bugs at hand.

Now, as you observed, some demuxers have abused AVERROR(EAGAIN) to let
the framework re-call them and continue their work, for example looking
for a resync marker.

This is an abuse, this is invalid. If AVERROR(EAGAIN) is handled
correctly, using it to look for a resync marker causes a delay to be
inserted. The delay is very small, negligible in most cases, but it
should not be there. And if the resync marker is hard to find, the delay
will add up.

If I remember correctly, this is precisely the issue AVERROR_REDO fixes:
a damaged sample that would be very slow to demux without consuming CPU
because of the delays introduced by the correct handling of
AVERROR(EAGAIN).

There was another way to fix the issue: have the demuxers loop
themselves looking for a resync. But it would have required more
reworking of the code.

Last detail: we do not have the API to return to the user in the middle
of the search for a resync marker. If this is something you want for
some reason, you will need to introduce something new. But if so, please
first explain your use case.

-- 
  Nicolas George


More information about the ffmpeg-devel mailing list