[FFmpeg-devel] Format of decoded subtitles (was: matroska: Identify S_TEXT/UTF-8 tracks as SRT and not TEXT.)
Clément Bœsch
ubitux at gmail.com
Thu Jun 7 09:39:46 CEST 2012
On Thu, May 31, 2012 at 05:07:33PM +0200, Nicolas George wrote:
> On sextidi 6 Prairial, year CCXX, Clément Bœsch wrote:
> > So if I understand correctly, you would propose a model where libsubconvert
> > does any kind of markup conversion, instead of the current model where the
> > decoder is "encoding" the event in ASS, bitmap or text?
>
> Well, it does not need to be a separate library per se, but I really think
> we need some kind of:
>
> ctx = avsub_markup_convert_init(ASS, HTML);
> avsub_markup_convert(ctx, sub_ass, sub_html);
>
> or something.
Maybe. I'm still not sure how this would help, but feel free to propose
something.
>
> > It should, for text-based subtitles. At least for the "useful" markup. But
> > I admit ASS has some annoying limitations, especially with some particular
> > subtitles features:
> >
> > - the first one I have in mind is that there is no text representation
> > for the "lasts until the next subtitle" feature. Example: MicroDVD (and
> > SAMI, which I'm working on ATM) has features like this:
> >
> > {500}{600}this is printed starting at frame 500 and last until frame 600
> > {1234}{}this starts being displayed at frame 1234...
> > {1400}{}...and will be "replaced" by this text until the end.
> >
> > We can express this in the AVPacket (pkt.duration = -1 for example),
> > but when encoding the ASS event, it's not possible to write 00:01:02:03
> > -1:-1:-1:-1, for instance. So we need to work around this.
>
> I am not sure I follow you: this is not markup, this is timing, and IMHO,
> timing belongs in the demuxer and should be decoded by it. For the example,
> the demuxer should output packets like that:
>
> { .pts = 500, .duration = 100, .data = "this is printed starting..." },
> { .pts = 1234, .duration = 166, .data = "this starts being displayed..." },
> { .pts = 1400, .duration = PTS_MAX - 1400, .data = "...and will..." },
>
Yes, this was a timing issue, which I did solve in the demuxer
(see 2d52ee8a1a4f9438df90f3c95a6fbfc8f6e812f3). But this kind of
workaround could have been put at another level. For instance, there is a
similar issue with SAMI: the next subtitle replaces the previous one
(there is no duration field or anything like it), so you always need to
demux two packets at a time, buffer one, etc. (whereas we could just have
set duration = -1 in the packet).
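
To make the look-ahead concrete, here is a minimal sketch of resolving
open-ended durations from the start of the following event. SubEvent and
resolve_durations() are hypothetical names for illustration, not FFmpeg
structures or functions, and the sketch assumes events are already sorted
by pts.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical event type for illustration; this is NOT an FFmpeg struct. */
typedef struct {
    int64_t pts;       /* start time, in whatever unit the format uses */
    int64_t duration;  /* -1 means "displayed until the next event" */
    const char *text;
} SubEvent;

/*
 * Replace every open-ended duration (-1) with the distance to the start
 * of the following event. Assumes ev[] is sorted by pts. The last event
 * has no successor, so an open-ended last event keeps -1.
 */
static void resolve_durations(SubEvent *ev, int n)
{
    assert(n == 0 || ev != NULL);
    for (int i = 0; i + 1 < n; i++)
        if (ev[i].duration < 0)
            ev[i].duration = ev[i + 1].pts - ev[i].pts;
}
```

Fed the three MicroDVD lines quoted above (durations 100, -1, -1), the
second event gets a duration of 166 and the last one keeps -1. A real
demuxer would do the same thing incrementally, holding one event back
until the next one has been read.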
> > - One random limitation regarding SAMI: this insane HTML-based format
> > (actually not HTML at all, but fully CSS2 compliant...) has two
> > subtitle placeholders. Basically it's two subtitles in one (one to
> > print the speaker's name, and one for what's being said), relying on
> > various presentation markup expectations which ASS can't honor (I don't
> > want to try converting <table> into ASS markup for example).
> >
> > - Another crazy one, though of limited usefulness: the <img> tag in SAMI
> > (yes...) or even in JACOSub.
>
> Even if they are crazy and we will never support them for rendering, we need
> to support them for encoding and decoding and stream copy. Therefore, I do
> not believe we can use ASS as a universal markup.
>
> The pseudo-HTML of SRT, OTOH, can pretty well be converted into ASS and
> back.
>
> But considering ASS, I am quite unsure about what part of the line should go
> into the decoded text. IMHO, "Start" and "End" should not (they are timing,
> not markup), but the other fields affect the markup.
>
See my other comment on the other thread.
> > - Last one is the precision limitation we already talked about (tb 1/100
> > for ASS, and 1/1000 for ones like SRT).
>
> Again, timing, not markup.
>
Yup, sorry, I digressed.
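
For reference, a minimal sketch of what the precision limitation means in
practice: going from a 1/1000 time base (SRT) to 1/100 (ASS) collapses
timestamps that differ by only a few milliseconds. ms_to_cs() and
format_ass_time() are hypothetical helpers written for this example, not
FFmpeg functions, and they assume non-negative timestamps.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Round a millisecond timestamp (tb 1/1000) to centiseconds (tb 1/100).
 * Assumes ms >= 0. */
static int64_t ms_to_cs(int64_t ms)
{
    assert(ms >= 0);
    return (ms + 5) / 10;
}

/* Write a centisecond timestamp in the ASS "H:MM:SS.cc" form.
 * Returns the value returned by snprintf(). */
static int format_ass_time(char *buf, size_t size, int64_t cs)
{
    assert(cs >= 0);
    return snprintf(buf, size, "%d:%02d:%02d.%02d",
                    (int)(cs / 360000),      /* hours */
                    (int)(cs / 6000 % 60),   /* minutes */
                    (int)(cs / 100 % 60),    /* seconds */
                    (int)(cs % 100));        /* centiseconds */
}
```

With these helpers, 62031 ms and 62034 ms both map to 6203 cs, which is
written as 0:01:02.03, so the two timestamps can no longer be told apart
once the event is encoded as ASS.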
[...]
--
Clément B.