[FFmpeg-devel] [PATCH v3] ffmpeg CLI multithreading

Wed Dec 6 19:31:27 EET 2023

Hi all,

As usual when someone disagrees with him, Nicolas converged to being
utterly unreasonable and deaf to all arguments. I see no point in
discussing this with him any further and intend to push the set
tomorrow, unless somebody else has substantial objections.

I've considered asking for a TC vote, but as he is not suggesting any
viable alternative, there is really nothing to vote on. So the purpose
of this email is just to summarize the dispute, so that others can
understand it more easily.

The issue concerns sub2video code, which allows converting bitmap
subtitles to video streams, mainly for hardsubbing purposes. As
subtitles are typically sparse in time, the demuxer that produces the
subtitle stream emits "heartbeat" frames that are sent to the
filtering code just like real subtitle frames.

The code in ffmpeg_filter.c then decides whether these heartbeat frames
should be sent to the filtergraph or ignored. The problem is that this
decision is currently made in a way that depends on what frames
previously arrived on _other_ filtergraph inputs (e.g. on video frames
in a graph with a subtitle and a video input). However, the inputs are
not synchronized, and the interleaving of frames on different inputs is
effectively arbitrary. E.g. it depends on the video decoder delay (and
thus on the number of frame threads, when frame threading is used).
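
To illustrate the dependence on interleaving, here is a toy model in
Python. It is NOT the actual ffmpeg_filter.c logic: the forwarding rule
(forward a heartbeat only if the video input has already reached its
timestamp) and all names are made up for this sketch. It only shows
that any rule looking at the other input gives different results for
different arrival orders of the same frames.

```python
def forwarded_heartbeats(events, always_send=False):
    """Return timestamps of heartbeat frames that reach the toy filtergraph.

    events is a list of (input_name, timestamp) tuples in arrival order.
    """
    latest_video_ts = None
    forwarded = []
    for source, ts in events:
        if source == "video":
            latest_video_ts = ts
        elif source == "heartbeat":
            # Hypothetical rule: the decision looks at the OTHER input.
            video_caught_up = latest_video_ts is not None and latest_video_ts >= ts
            if always_send or video_caught_up:
                forwarded.append(ts)
    return forwarded


# The same frames, interleaved in two different ways
# (e.g. a decoder with little delay vs. one with more delay).
low_delay = [("video", 0), ("heartbeat", 0), ("video", 1),
             ("heartbeat", 1), ("video", 2), ("heartbeat", 2)]
high_delay = [("heartbeat", 0), ("heartbeat", 1), ("video", 0),
              ("video", 1), ("heartbeat", 2), ("video", 2)]

print(forwarded_heartbeats(low_delay))
print(forwarded_heartbeats(high_delay))
print(forwarded_heartbeats(high_delay, always_send=True))
```

With the made-up rule, the two orderings forward different sets of
heartbeats; with always_send=True the result no longer depends on the
ordering.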

The reason this arbitrariness has not become a major issue until now is
that it is deterministic for a given run on a given machine (sub2video
FATE tests do not use a frame-threaded decoder, and so do not exhibit
the problem). With ffmpeg CLI becoming fully parallel, the results
become non-deterministic and change from run to run, which forces me to
do something about this.

My solution in patch 01/10 changes the filtering code to always send the
heartbeat frames to the filtergraph. This not only makes the results
deterministic, but also improves subtitle timing in FATE tests.

Nicolas presented a testcase that consists of taking video and subtitle
streams from a single source, offsetting them against each other by a
fixed delay, and overlaying the subtitles onto the video. After my
patch, this results in the filtergraph buffering a number of heartbeat
frames proportional to the offset, which causes higher memory
consumption.
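
The proportionality can be seen in a rough Python sketch. This is a toy
model under simplifying assumptions, not a measurement of ffmpeg: one
heartbeat is read per video frame from the same source, subtitle
timestamps are shifted later by the offset, and a heartbeat can only be
consumed once the video has reached its timestamp. The function name
and the millisecond units are mine.

```python
def peak_buffered_heartbeats(offset_ms, heartbeat_interval_ms, duration_ms):
    """Peak number of heartbeat frames waiting in a toy overlay queue."""
    queue = []
    peak = 0
    for t in range(0, duration_ms, heartbeat_interval_ms):
        # Reading the single source at time t yields a video frame with
        # timestamp t and a heartbeat shifted by the offset.
        queue.append(t + offset_ms)
        # The video has reached t, so heartbeats up to t can be consumed.
        queue = [ts for ts in queue if ts > t]
        peak = max(peak, len(queue))
    return peak


for offset in (0, 1000, 2000):
    print(offset, peak_buffered_heartbeats(offset, 100, 10000))
```

In this model the peak queue length is the offset divided by the
heartbeat interval, i.e. it grows linearly with the offset.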

However,
* the testcase suffers from the above problem - its output can change
  significantly depending on the number of decoder frame threads; this
  is fixed by my patch;
* the extra buffering added by the patch is similar to what would be
  added by the muxer interleaving queue, were the streams remuxed rather
  than overlaid;
* the buffering can be avoided entirely by opening the input twice.

I thus consider his argument (that the patch is "breaking" the testcase)
invalid, as the testcase is
* contrived;
* already broken;
* actually fixed by my patch.

Nicolas has also NOT suggested any viable alternative approach.

-- 
Anton Khirnov