[FFmpeg-devel] Status and Plans for Subtitle Filters

Soft Works softworkz at hotmail.com
Sat Feb 22 12:59:46 EET 2020


> -----Original Message-----
> From: Clément Bœsch <u at pkh.me>
> Sent: Saturday, February 22, 2020 9:47 AM
> To: FFmpeg development discussions and patches <ffmpeg-
> devel at ffmpeg.org>
> Cc: Soft Works <softworkz at hotmail.com>
> Subject: Re: [FFmpeg-devel] Status and Plans for Subtitle Filters
> 
> On Fri, Feb 14, 2020 at 03:26:30AM +0000, Soft Works wrote:
> > Hi,
> >
> 
> Hi,
> 
> > I am looking for some guidance regarding future plans about processing
> subtitle streams in filter graphs.
> >
> > Please correct me where I'm wrong - this is the situation as I've understood
> it so far:
> [...]
> 
> Your analysis was pretty much on point. I've been away from FFmpeg
> development since around the time of that patchset. While I can't
> recommend a course of action, I can elaborate on what was blocking and
> missing. Beware that this is reconstructed from my unreliable memory and I
> may be forgetting important points.
> 
> Last state can be found at
> https://github.com/ubitux/FFmpeg/tree/subtitles-new-api
> 
> The last WIP commit includes a TODO.txt which I'm sharing here for the
> record:
> 
> > TODO:
> > - heartbeat mechanism
> > - drop sub2video (needs heartbeat)
> > - properly deal with -ss and -t (need strim filter?)
> > - sub_start_display/sub_end_display needs to be honored
> > - find a test case for dvbsub as it's likely broken (ffmpeg.c hack is
> >   removed and should be replaced by a EAGAIN logic in lavc/utils.c)
> > - make it pass FATE:
> >   * fix cc/subcc
> >   * broke various other stuff
> > - Changelog/APIchanges
> > - proper API doxy
> > - update lavfi/subtitles?
> > - merge [avs]null filters
> > - filters doc
> > - avcodec_default_get_buffer2?
> > - how to transfer subtitle header down to libavfilter?
> 
> The biggest TODO entry right now is the heartbeat mechanism, which is
> required in order to drop the sub2video hack. You've seen that discussed
> in the thread.
> 
> Thing is, that branch is already relatively invasive and may include a
> controversial API change. Typically, the way I decided to handle subtitle
> text/rectangle allocation within AVSubtitle is "different", but I couldn't
> come up with a better solution. Basically, we have to fit them in AVFrame
> for a clean integration within the FFmpeg ecosystem, but subtitles are not
> simple buffers like audio and video can be: they have to be backed by more
> complex dynamic structures.
> 
> Also unfortunately, addressing the problem through an iterative process is
> extremely difficult in the current situation due to historical technical
> debt. You may have noticed that the subtitle decode and encode APIs are a
> few generations behind the audio and video ones. The reason they weren't
> modernized earlier is that it was already a pita in the past.
> 
> The subtitles refactor requires seeing the big picture and all the problems
> at once. Since the core change (subtitles in AVFrame) requires the
> introduction of a new subtitles structure and API, it also involves
> addressing the shortcomings of the original API (or maybe we could tolerate
> a new API that actually looks like the old one?). So even if we ignore the
> subtitle-in-avframe thing, we don't have a clear answer for a sane API that
> handles everything.
> Here is a non-exhaustive list of stuff that we have to take into account while
> thinking about that:
> 
> - text subtitles with and without markup
> - sparsity, overlapping
> - different semantics for duration (duration available, no known duration,
>   event-based clearing, ...)
> - closed captions / teletext
> - bitmap subtitles and their potential colorspaces (each rectangle as an
>   AVFrame is way overkill but technically that's exactly what it is)
> 
> This should give you a hint on why the task has been quite overwhelming.
> Subtitles were the reason I initially came into the multimedia world, and they
> might have played a role in why I distanced myself from it.
> 
> That said, I'd say the main reason it was put on standby was that I was
> kind of alone in that struggle. While I got a lot of support from people, I
> think the main help I needed would have been formalizing the API we wanted.
> Like, code and API gymnastics are not that much of a problem, but deciding
> on what to do, and what path we take to reach that point, is/was the core
> issue.
> 
> And to be honest, I never really made up my mind on abandoning the work.
> So I'm making the call again: if someone is interested in addressing the
> problem once and for all, I can spend some time rebasing the current state
> and clarifying in detail what has been said in this mail, so we can work
> together on an API contract we want between FFmpeg and our users. When we
> have this, I think progress can be made again.

Thanks a lot for taking the time to respond!

I have been an on-and-off reader of ffmpeg-devel for a few years, making some
small contributions, most of which were ignored, which in turn made me
refrain from submitting patches of larger scale.

Reading through the discussion around your patch was discouraging,
even destructive in some parts. I understand why you felt alone in that,
and I wonder why nobody else chimed in. I mean, sometimes there are
extensive discussions about some of the least important video formats in
the world, while subtitles are a pretty fundamental thing...

On the other hand - playing devil's advocate: Why even handle a subtitle 
media type in filtergraphs?

Would there be any filters at all that would operate on subtitles
(other than rendering to a video surface)?

What kind of subtitle filters would make sense?
- An AllCaps or ToLower filter?
- A filter to modulate font size based on audio volume?
- A "Force Font-of-theDay" filter?

I'm sure it will be possible to find a few more reasonable examples.. ;-)

But if the primary purpose of having subtitles in filtergraphs would be
to have them eventually converted to bitmaps, and given that it is so
extremely difficult and controversial to implement this, plus that there
seems to be only moderate support for it from other developers -
couldn't it be an easier and more pragmatic solution to simply convert
the subtitles to images before they enter the filtergraph?
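
Just to illustrate: for bitmap subtitles, the current sub2video hack in
ffmpeg.c already does roughly that, converting the subtitle stream into
video frames before the graph so that it can be fed straight into overlay.
A rough sketch (assuming the input carries a bitmap subtitle stream such
as dvdsub or PGS):

    ffmpeg -i input.mkv \
           -filter_complex "[0:v][0:s:0]overlay[v]" \
           -map "[v]" -map 0:a output.mkv

Text subtitles would still need a separate path, e.g. the existing
subtitles filter, which renders from a file via libass rather than from
the subtitle stream.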

softworkz
