[FFmpeg-devel] [PATCH 1/2] avcodec/{ass, webvttdec}: fix handling of backslashes

Soft Works softworkz at hotmail.com
Sat Feb 5 01:24:58 EET 2022



> -----Original Message-----
> From: ffmpeg-devel <ffmpeg-devel-bounces at ffmpeg.org> On Behalf Of
> Oneric
> Sent: Friday, February 4, 2022 10:52 PM
> To: FFmpeg development discussions and patches
> <ffmpeg-devel at ffmpeg.org>
> Subject: Re: [FFmpeg-devel] [PATCH 1/2] avcodec/{ass, webvttdec}: fix
> handling of backslashes
> 
> On Fri, Feb 04, 2022 at 02:30:37 +0100, Andreas Rheinhardt wrote:
> > All text-based subtitles are supposed to be UTF-8 when they reach the
> > decoder; if it isn't, the user has to set the appropriate
> > -sub_charenc and -sub_charenc_mode.
> >
> > - Andreas
> 
> Thanks for the info! Then at least the UTF-8 assumption
> is no problem after all.
> 
> 

[..]

> >
> > I'm not sure whether all ffmpeg text-sub encoders can handle
> > those chars - which could be verified of course.
> 
> Since it's in the BMP and ffmpeg already seems happy to assume some
> UTF-8 support by converting everything to it, I'm not worried about
> this until proven wrong.

Proven wrong: https://github.com/libass/libass/issues/507


> > Finally, those chars are a pest. I'm using them myself for a
> > specific use case, but when you don't know they are there, it can
> > drive you totally mad, eventually even thinking your system or
> > software is faulty.
> >
> > Example:
> >
> > Open your patch file [2/2] and search for the string
> > "123456\NAscending". You can see the string in two lines, but search
> > will only find one of them.
> >
> > Or just look at the two lines directly. They are preceded by + and -
> > even though both appear identical.
> 
> Actually, I see this with helpful colouring lost here:
> 
>   -Dialogue: 0,0:00:55.00,0:01:00.00,Default,,0,0,0,,Descending: 123456\NAscending: 123456^M
>   +Dialogue: 0,0:00:55.00,0:01:00.00,Default,,0,0,0,,Descending: <200f>123456<200e>\NAscending: 123456^M

I didn't say you won't be able to find a viewer that can display
them. :-)


> More plain-text oriented editors likely won't show them though, yes.

Yes => pest
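
To make the search example above concrete, here is a minimal Python
sketch. It is an illustration only and not part of the patch; the helper
name find_format_chars is made up, and it assumes the two marks are
U+200F and U+200E, as the <200f>/<200e> rendering quoted above suggests.
It shows why a plain substring search misses the modified line and how
such characters can be made visible:

```python
import unicodedata


def find_format_chars(text):
    """Return (offset, 'U+XXXX', name) for every Unicode format (Cf) character."""
    return [
        (i, f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"))
        for i, ch in enumerate(text)
        if unicodedata.category(ch) == "Cf"
    ]


# The two dialogue texts from the diff; \\N is the literal ASS line break.
plain = "Descending: 123456\\NAscending: 123456"
marked = "Descending: \u200f123456\u200e\\NAscending: 123456"

needle = "123456\\NAscending"
print(needle in plain)   # True
print(needle in marked)  # False: U+200E sits between "123456" and "\N"

# List the invisible characters together with their offsets.
for hit in find_format_chars(marked):
    print(hit)
```

Both strings look the same in most plain-text editors, yet only the
first one matches the search.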


> > That might be true, but I think it's valid to say that such characters
> > are very unusual "original" subtitle sources and that's why I don't
> > think it's a good idea for ffmpeg to start injecting them.
> 
> Don't underestimate what subtitle authors can come up with :)

Sure. But a subtitle author is responsible for their authored
subtitles while ffmpeg is responsible for encoding a large
part of the world's subtitles.


And from that same perspective I find the proportions of this
proposal somewhat insane:

You want to "pollute" gazillions of subtitle streams in the 
world from multiple subtitle formats with invisible 
characters in order to solve an escaping problem in ffmpeg?

softworkz
