[FFmpeg-devel] Status and Plans for Subtitle Filters

Sat Feb 22 13:30:52 EET 2020

On Sat, Feb 22, 2020 at 10:59:46AM +0000, Soft Works wrote:
[...]
> Reading through the discussion around your patch was discouraging, 
> even destructive in some parts. I understand why you felt alone with that
> and I wonder why nobody else chimed in. I mean, sometimes there are 
> extensive discussions about some of the least important video formats in 
> the world, while subtitles are a pretty fundamental thing...

I think the main reason is that subtitles are a different beast in the
multimedia world, and most people intuitively understand this is not fun
work at all. It's much more comfortable to work with audio and video since
the framework design revolves around them.

> On the other hand - playing devil's advocate: Why even handle a subtitle 
> media type in filtergraphs?
> 

It's not only about lavfi: the whole framework works with AVFrame. If you
use something else, you'll have to duplicate most of the APIs to handle
subtitles as well. In the past, audio was actually handled separately, and
its unification with video was a relief. Going down another path for
subtitles is going to be extremely invasive, verbose, and annoying to
maintain whenever the API changes.

> Would there be any filters at all that would operate on subtitles?
>  (other than rendering to a video surface)

Sure. A few ideas that come to my mind:

- rasterization (text subtitles to bitmap subtitles)
- ocr (bitmap subtitles to text)
- all kinds of text processing (possibly piped to some external tools)
- censoring bad words
- inserting "watermark" text
- timing processing: trimming, shift, scaling of time
- lorem ipsum or similar "source" filter (equivalent to our video test
  patterns) for testing purposes
- audio to text for auto captioning
- text to audio for audio synthesis
- concat multiple subtitle files (think of multiple episodes merged into
  one, and you want to do the same for subtitles)
- merge/overlap multiple subtitle tracks (think of multi-language
  subtitles)
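
To make the "timing processing" item a bit more concrete, here is a rough
sketch of what shift, scale and trim could look like on a list of subtitle
events. Everything in it is an assumption for illustration: SubEvent,
sub_retime() and sub_trim() are hypothetical names, not existing FFmpeg
structures or functions, and times are assumed to be plain milliseconds
rather than timestamps in a stream time base.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical subtitle event, NOT an FFmpeg type. Times are in ms. */
typedef struct SubEvent {
    int64_t start_ms;
    int64_t end_ms;
    const char *text;
} SubEvent;

/* Scale every timestamp by num/den, then shift it by offset_ms (in place).
 * Integer division truncates toward zero. */
static void sub_retime(SubEvent *ev, size_t n,
                       int64_t num, int64_t den, int64_t offset_ms)
{
    assert(num > 0 && den > 0);
    for (size_t i = 0; i < n; i++) {
        ev[i].start_ms = ev[i].start_ms * num / den + offset_ms;
        ev[i].end_ms   = ev[i].end_ms   * num / den + offset_ms;
    }
}

/* Keep only what falls inside [from_ms, to_ms): events fully outside are
 * dropped, events crossing a boundary are clamped. Returns the new count. */
static size_t sub_trim(SubEvent *ev, size_t n, int64_t from_ms, int64_t to_ms)
{
    size_t out = 0;
    for (size_t i = 0; i < n; i++) {
        SubEvent e = ev[i];
        if (e.end_ms <= from_ms || e.start_ms >= to_ms)
            continue;                 /* entirely outside the window */
        if (e.start_ms < from_ms)
            e.start_ms = from_ms;     /* clamp the head */
        if (e.end_ms > to_ms)
            e.end_ms = to_ms;         /* clamp the tail */
        ev[out++] = e;
    }
    return out;
}
```

A real filter would have to carry such events through whatever frame
abstraction the framework uses, which is exactly the open design question
here; the sketch only covers the arithmetic.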

[...]
> But when the primary purpose of having subtitles in filtergraphs would be 
> to have them eventually converted to bitmaps, and given that it's really so 
> extremely difficult and controversial to implement this, plus that there
> seems to be only moderate support for this from other developers- 
> could it possibly be an easier and more pragmatic solution to convert
> the subtitles to images simply before they are entering the filtergraph?

That means it would likely be available only within the command line tool
and not through the API, unless you design a separate "libavsubtitle"
(discussed several times in the past). Even then, at some point you'll need
many interfaces with the usual demuxing-decoding-encoding-muxing pipeline.

Regards,

-- 
Clément B.