[FFmpeg-devel] Status and Plans for Subtitle Filters

Sat Feb 22 10:47:20 EET 2020

On Fri, Feb 14, 2020 at 03:26:30AM +0000, Soft Works wrote:
> Hi,
> 

Hi,

> I am looking for some guidance regarding future plans about processing subtitle streams in filter graphs.
> 
> Please correct me where I'm wrong - this is the situation as I've understood it so far:
[...]

Your analysis was pretty much on point. I've been away from FFmpeg development
since around the time of that patchset. While I can't recommend a course of
action, I can elaborate on what was blocking and missing. Beware that this is
reconstructed from my unreliable memory and I may forget important points.

Last state can be found at https://github.com/ubitux/FFmpeg/tree/subtitles-new-api

The last WIP commit includes a TODO.txt which I'm sharing here for the
record:

> TODO:
> - heartbeat mechanism
> - drop sub2video (needs heartbeat)
> - properly deal with -ss and -t (need strim filter?)
> - sub_start_display/sub_end_display needs to be honored
> - find a test case for dvbsub as it's likely broken (ffmpeg.c hack is
>   removed and should be replaced by a EAGAIN logic in lavc/utils.c)
> - make it pass FATE:
>   * fix cc/subcc
>   * broke various other stuff
> - Changelog/APIchanges
> - proper API doxy
> - update lavfi/subtitles?
> - merge [avs]null filters
> - filters doc
> - avcodec_default_get_buffer2?
> - how to transfer subtitle header down to libavfilter?

The biggest TODO entry right now is the heartbeat mechanism, which is required
to be able to drop the sub2video hack. You've seen that discussed in the
thread.
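
To make the heartbeat idea a bit more concrete, here is a rough standalone
sketch. It is not code from the branch: SubEvent and sub_with_heartbeats()
are hypothetical names made up for illustration, and the real mechanism may
well look different. The assumption is that a consumer of a sparse subtitle
stream needs to be told "time has reached T and nothing new happened", so for
every video timestamp at which no subtitle starts, an empty heartbeat event is
emitted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical event type for illustration, not an FFmpeg structure. */
typedef struct SubEvent {
    int64_t     pts;          /* presentation time, arbitrary ticks */
    int         is_heartbeat; /* 1: no payload, only "time reached pts" */
    const char *text;         /* NULL for heartbeats */
} SubEvent;

/*
 * Interleave sparse subtitle events (sorted by pts) with video timestamps
 * (sorted as well). Every video pts that is not matched by a subtitle
 * starting at exactly that pts gets a heartbeat event.
 * Returns the number of events written to out (at most out_cap).
 */
size_t sub_with_heartbeats(const SubEvent *subs, size_t nb_subs,
                           const int64_t *video_pts, size_t nb_video,
                           SubEvent *out, size_t out_cap)
{
    size_t n = 0, i = 0;

    assert(out != NULL);

    for (size_t j = 0; j < nb_video; j++) {
        int covered = 0;

        /* Flush every real subtitle due at or before this video frame. */
        while (i < nb_subs && subs[i].pts <= video_pts[j]) {
            if (n == out_cap)
                return n;
            covered = subs[i].pts == video_pts[j];
            out[n++] = subs[i++];
        }

        /* Nothing started exactly here: tell the consumer time moved on. */
        if (!covered) {
            if (n == out_cap)
                return n;
            out[n++] = (SubEvent){ .pts = video_pts[j],
                                   .is_heartbeat = 1,
                                   .text = NULL };
        }
    }

    /* Subtitles located after the last video frame are passed through. */
    while (i < nb_subs && n < out_cap)
        out[n++] = subs[i++];

    return n;
}
```

With a single subtitle at pts 20 and video frames at 0, 10, 20 and 30, this
yields heartbeats at 0, 10 and 30 and the real event at 20.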

Thing is, that branch is already relatively invasive and may include
controversial API changes. For instance, the way I decided to handle subtitle
text/rectangle allocation within AVSubtitle is "different" but I couldn't come
up with a better solution. Basically, we have to fit them in AVFrame for a
clean integration within the FFmpeg ecosystem, but subtitles are not simple
buffers like audio and video can be: they have to be backed by more complex
dynamic structures.
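
As an illustration of the "complex dynamic structures" problem, here is a
minimal standalone sketch of a reference-counted subtitle payload whose
release has to walk nested allocations. This is not the approach taken in the
branch; SubRect, SubPayload and the sub_payload_*() functions are
hypothetical names, and the reference counting is deliberately simplistic
(not thread-safe).

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical types for illustration, not FFmpeg structures. */
typedef struct SubRect {
    char *text;              /* separately allocated, may be NULL */
} SubRect;

typedef struct SubPayload {
    int      refcount;       /* plain int: this sketch is not thread-safe */
    unsigned nb_rects;
    SubRect *rects;
} SubPayload;

SubPayload *sub_payload_alloc(unsigned nb_rects)
{
    SubPayload *p = calloc(1, sizeof(*p));
    if (!p)
        return NULL;
    if (nb_rects) {
        p->rects = calloc(nb_rects, sizeof(*p->rects));
        if (!p->rects) {
            free(p);
            return NULL;
        }
    }
    p->nb_rects = nb_rects;
    p->refcount = 1;
    return p;
}

/* Copy text into rectangle idx; returns 0 on success, -1 on error. */
int sub_payload_set_text(SubPayload *p, unsigned idx, const char *text)
{
    size_t len;
    char *copy;

    assert(p && text);
    if (idx >= p->nb_rects)
        return -1;
    len = strlen(text) + 1;
    copy = malloc(len);
    if (!copy)
        return -1;
    memcpy(copy, text, len);
    free(p->rects[idx].text);
    p->rects[idx].text = copy;
    return 0;
}

SubPayload *sub_payload_ref(SubPayload *p)
{
    p->refcount++;
    return p;
}

/* Drop one reference; the last one frees every nested allocation. */
void sub_payload_unref(SubPayload **pp)
{
    SubPayload *p = *pp;

    *pp = NULL;
    if (!p || --p->refcount > 0)
        return;
    for (unsigned i = 0; i < p->nb_rects; i++)
        free(p->rects[i].text);
    free(p->rects);
    free(p);
}
```

The point is only that, unlike a flat audio or video buffer, the free path has
to know the layout of what it owns.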

Also unfortunately, addressing the problem through an iterative process is
extremely difficult in the current situation due to historical technical debt.
You may have noticed that the subtitle decode and encode APIs are a few
generations behind the audio and video ones. The reason they weren't modernized
earlier is that it was already a pita in the past.

The subtitles refactor requires seeing the big picture and all the problems at
once. Since the core change (subtitles in AVFrame) requires the introduction of
a new subtitles structure and API, it also involves addressing the shortcomings
of the original API (or maybe we could tolerate a new API that actually looks
like the old one?). So even if we ignore the subtitle-in-avframe thing, we don't
have a clear answer for a sane API that handles everything. Here is a
non-exhaustive list of stuff that we have to take into account while thinking
about that:

- text subtitles with and without markup
- sparsity, overlapping
- different semantics for duration (duration available, no known duration,
  event-based clearing, ...)
- closed captions / teletext
- bitmap subtitles and their potential colorspaces (each rectangle as an
  AVFrame is way overkill but technically that's exactly what it is)
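
To illustrate just the duration item of that list, here is a small sketch of
how the different semantics could be told apart when computing when an event
stops being displayed. None of these names exist in FFmpeg; SubDurKind,
SubTiming and sub_end_time() are assumptions made up for the example, and a
real API would have to cover the other items as well.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical duration semantics, for illustration only. */
typedef enum SubDurKind {
    SUB_DUR_KNOWN,      /* duration field is valid */
    SUB_DUR_UNKNOWN,    /* no duration known */
    SUB_DUR_UNTIL_NEXT  /* cleared by the next event (event-based clearing) */
} SubDurKind;

typedef struct SubTiming {
    int64_t    pts;      /* start time, arbitrary ticks */
    int64_t    duration; /* only meaningful for SUB_DUR_KNOWN */
    SubDurKind kind;
} SubTiming;

/*
 * Compute the end time of cur. next is the following event, or NULL if it
 * has not been seen yet. Returns 1 and sets *end when the end time can be
 * determined, 0 when it cannot be known at this point.
 */
int sub_end_time(const SubTiming *cur, const SubTiming *next, int64_t *end)
{
    assert(cur && end);

    switch (cur->kind) {
    case SUB_DUR_KNOWN:
        *end = cur->pts + cur->duration;
        return 1;
    case SUB_DUR_UNTIL_NEXT:
        if (!next)
            return 0;   /* must wait for the clearing event */
        *end = next->pts;
        return 1;
    case SUB_DUR_UNKNOWN:
    default:
        return 0;
    }
}
```

Only the first case can be resolved from the event alone; the other two are
the ones that make sparse streams awkward for a frame-based pipeline.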

This should give you a hint on why the task has been quite overwhelming.
Subtitles were the reason I initially came into the multimedia world, and they
might have played a role in why I distanced myself from it.

That said, I'd say the main reason it was put on standby was that I was
kind of alone in that struggle. While I got a lot of support from people, I
think the main help I needed would have been formalizing the API we wanted.
Like, code and API gymnastics are not that much of a problem, but deciding on
what to do, and what path we take to reach that point is/was the core issue.

And to be honest, I never really made up my mind on abandoning the work. So I'm
calling it again: if someone is interested in addressing the problem once and
for all, I can spend some time rebasing the current state and clarifying in
detail what has been said in this mail so we can work together on an API
contract we want between FFmpeg and our users. When we have this, I think
progress can be made again.

Regards,

-- 
Clément B.