[FFmpeg-devel] Politics

Daniel Cantarín canta at canta.com.ar
Sat Dec 18 19:59:11 EET 2021


 >> as well as ATSC subtitles
 >
 > There are like 2 or 3 characters in each frame. Sometimes
 > they are shown as they are coming in, sometimes only
 > when a line is completed, sometimes needs to wait
 > for subsequent frames before emitting new characters.
 > This is really not a high-precision thing.

Can confirm.

I did implement this using video filters, precisely because
we don't have subtitle filters and I was forced to do OCR. It
works OK, but it's unreliable when it comes to timings.

You have 2 bytes/chars per frame, non-ASCII chars use 2
bytes, and commands also use one or two bytes. This leaves
you with a max speed of 60 chars per second for a 30 FPS
video stream: a condition no other subtitle/caption format
that I know of shares. When there's fast dialogue, it desyncs.
It later re-syncs: it doesn't drift away. But dialogue faster
than 60 chars (including non-ASCII chars and commands) per
second does affect the caption timings.
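
To put rough numbers on that, here's a minimal sketch of the byte
budget. It assumes ASCII chars cost 1 byte and that non-ASCII chars
and commands cost 2 bytes each (a worst-case simplification). The
function names are made up for illustration; this is not FFmpeg code.

```python
BYTES_PER_FRAME = 2  # caption payload carried by each video frame


def bytes_per_second(fps):
    """Caption bytes that fit into one second of video."""
    return BYTES_PER_FRAME * fps


def caption_cost(text, commands=0):
    """Bytes needed for `text` plus `commands` control codes.

    Assumption: ASCII chars take 1 byte, anything else takes 2 bytes,
    and every command takes 2 bytes.
    """
    chars = sum(1 if ord(ch) < 128 else 2 for ch in text)
    return chars + 2 * commands


def seconds_to_send(text, fps=30, commands=0):
    """Time the stream needs to carry the whole caption."""
    return caption_cost(text, commands) / bytes_per_second(fps)


def lag(text, spoken_seconds, fps=30, commands=0):
    """How far the caption falls behind dialogue spoken in `spoken_seconds`."""
    return max(0.0, seconds_to_send(text, fps, commands) - spoken_seconds)
```

With these assumptions, a line that costs 90 bytes but is spoken in
one second ends up half a second late at 30 FPS, and the lag goes
back to zero once the dialogue slows down.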

But there's more. Implementations in players are chaotic,
and some commands are ignored or work erratically. Also,
there are line width issues: 32 chars max by the standard,
even when some players change that. But you have to deal
with word wrapping by doing text treatment and applying
commands. Then there are 4 modes of rendering, as you say,
which you are supposed to control but then players do what
they want, and given that these are bytes added to the video
frames you gotta encode two different video streams if you
want to have two different caption configurations.
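
The word wrapping part, at least, can be sketched with Python's
standard textwrap module. This covers only the text treatment, not
the positioning commands, and the row width is a parameter because
players differ. The name wrap_caption is hypothetical.

```python
import textwrap


def wrap_caption(text, width):
    """Split caption text into rows of at most `width` chars.

    Words are kept whole where possible; a single word longer than
    `width` is broken so that no row overflows.
    """
    return textwrap.wrap(text, width=width, break_long_words=True)
```

Each returned row would still need its own row-positioning command
before being sent, which also eats into the bytes-per-frame budget.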

It may be the most available format in the world. But it's
also a PITA because of a lot of details, and that's why I'm so
interested in finally having subtitles in filters first, and
dealing with the internal details later.



