[FFmpeg-devel] Status and Plans for Subtitle Filters

Nicolas George george at nsup.org
Thu Feb 27 00:56:19 EET 2020


Michael Niedermayer (12020-02-26):
> I do think i misunderstand something here
> because if we have a video with a signpost shown from 0:00 to 1:00
> and another shown from 0:30 to 1:30 then the subtitles translating
> or commenting that would overlap.

The existence of signs implies that overlap does happen frequently and
needs to happen gracefully. The idea of speech synthesis implies that
splitting and merging cannot be used indiscriminately. Both are true;
they do not need to happen at the same time.

Yet, they can happen at the same time, if for example spoken dialogue
meant for speech synthesis is separate (with a different ASS style or
layer) from the signs.

Furthermore, speech synthesis was just one example among many to explain
why splitting and merging are not acceptable. There are many others. The
case of timed animations has already been given.
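
To make the problem concrete, here is a minimal Python sketch of what
"splitting" does to the two overlapping signs from the example above. It
is an illustration only: the Event tuple and split_non_overlapping() are
hypothetical names, not FFmpeg API, and the texts "Sign A" / "Sign B" are
made up.

```python
# Hypothetical illustration only; none of these names are FFmpeg API.
from typing import List, Tuple

Event = Tuple[int, int, str]  # (start_seconds, end_seconds, text)


def split_non_overlapping(events: List[Event]) -> List[Event]:
    """Cut events at every boundary so that no two output segments overlap."""
    # Every start and end time becomes a cut point.
    bounds = sorted({t for start, end, _ in events for t in (start, end)})
    segments = []
    for seg_start, seg_end in zip(bounds, bounds[1:]):
        # Collect the texts of all events visible during this segment.
        active = [text for start, end, text in events
                  if start < seg_end and end > seg_start]
        if active:
            segments.append((seg_start, seg_end, " / ".join(active)))
    return segments


# Two signs: one shown 0:00-1:00, the other 0:30-1:30.
events = [(0, 60, "Sign A"), (30, 90, "Sign B")]
segments = split_non_overlapping(events)
for seg in segments:
    print(seg)
```

Two events come out as three segments, and each text now appears in two
of them. A consumer that treats every segment as a new event (a speech
synthesizer, or an animation timed from the start of the event) would
say or restart each one twice.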

> and also the video frames showing these signposts overlap , ehm i mean
> they dont overlap. That is what i do not understand.
> Video frames dont do that and its fine
> and then theres audio
> someone playing a note on the trumpet and another a note on the piano
> again we have 2 AVFrame overlapp i mean not overlapping.
> So why subtitles ?
> 
> and one could even argue why it would make sense for audio to be
> overlapping with this information about instruments and it is in 
> midi and mod files. And a filter writing notes for the instruments
> would benefit from this and simlar a midi encoder

You're hinting at the answer. If we worked with MIDI and mod files,
splitting or merging notes would be unacceptable. Same goes for frames:
if we were a vector drawing program, rasterizing the graphic objects
would be unacceptable. But we're not: we consider audio to be just a stream
of samples going to the speakers, and if some codec tries to do something
fancy with notes, that's its problem and we don't try to help. Same goes
for video: it's just pixels going to the screen, we don't try to
preserve sprites.

But it's not the same with subtitles. Subtitles are not just a bunch of
pixels that get overlaid on top of the video. Well, they could be, but
it's not what the users expect. Subtitles are often hand-written,
partially or completely, and read directly. A tool that mangles them would
be useless for most uses.

Regards,

-- 
  Nicolas George