[MPlayer-dev-eng] Small changes to subreader.c file

Aurelien Jacobs aurel at gnuage.org
Wed Oct 12 00:27:49 CEST 2005


On Tue, 11 Oct 2005 23:42:57 +0200
Adam Tlałka <atlka at pg.gda.pl> wrote:

> OK.
> If we are talking about standards - CR/LF text files are widely used in
> text data exchange between different systems and this is stated in many
> docs, for example:

Nice examples!

> in RFC 2646  -  The Text/Plain Format Parameter - 3.  The Problem
> 
>     The Text/Plain media type is the lowest common denominator of
>     Internet email, with lines of no more than 997 characters (by
>     convention usually no more than 80), and where the CRLF sequence
>     represents a line break [MIME-IMT].

Here the RFC simply explains that CRLF represents a line break, because
this is not necessarily obvious.
It doesn't say that CRLF MUST be used!
It doesn't say that LF is invalid or doesn't represent a line break.

> in RFC 2854  -  The 'text/html' Media Type - 4. Encoding
> considerations
> 
> As with all MIME text subtypes, the canonical form of "text/html"
>     must always represent a line break as a sequence of a CR byte
>     value (0x0D) followed by an LF (0x0A) byte value.  Similarly, any
>     occurrence of such a CRLF sequence in "text/html" must represent a
>     line break.  Use of CR byte values and LF byte values outside of
>     line break sequences is also forbidden. This rule applies
>     regardless of the character encoding ('charset') involved.
> it's a proposition ;-)

OK, this one clearly states that CRLF is the only valid line break!
That's what should have been written in the SRT spec if you wanted to
impose CRLF usage. The fact is that it's not.

> in XML definition - Extensible Markup Language (XML) 1.0 (Third Edition)
>   2.11 End-of-Line Handling
> 
> XML parsed entities are often stored in computer files which, for
> editing convenience, are organized into lines. These lines are
> typically separated by some combination of the characters CARRIAGE
> RETURN (#xD) and LINE FEED (#xA).
> 
> To simplify the tasks of applications, the XML processor MUST behave
> as if it normalized all line breaks in external parsed entities
> (including the document entity) on input, before parsing, by
> translating both the two-character sequence #xD #xA and any #xD that
> is not followed by #xA to a single #xA character.

That one is very clear!
CRLF, CR and LF are all accepted as line breaks, and any XML
parser MUST be able to deal with any of these formats.
That's exactly what needs to be done for SRT.
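
To make the XML rule concrete, here is a minimal sketch in C of that kind of
normalization, assuming the whole file has already been read into a memory
buffer. The name normalize_eol is made up for illustration; it is not a
function from subreader.c, and a real reader might just as well handle this
line by line.

```c
#include <stddef.h>

/* Hypothetical helper, not taken from subreader.c: rewrites buf in place so
 * that CRLF and lone CR both become a single LF. Returns the new length.
 * The output never grows, so writing over the input is safe. */
size_t normalize_eol(char *buf, size_t len)
{
    size_t in = 0, out = 0;

    while (in < len) {
        char c = buf[in++];
        if (c == '\r') {
            /* Swallow the LF of a CRLF pair so it is not counted twice. */
            if (in < len && buf[in] == '\n')
                in++;
            c = '\n';
        }
        buf[out++] = c;
    }
    return out;
}
```

After such a pass the rest of a parser only ever has to look for LF, whatever
EOL convention the file was written with.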

> My goal is to have the subtitle output format as the most compatible
> one. If I write them to CD-R I just want to have a usable file under
> any player

Sure. And since there are already lots of SRT files in the wild using
different EOLs, the only solution is to ensure any player is able to read
all of those SRT files, the same way an XML parser should deal with any
kind of EOL.
MPlayer is already able to read SRT files with all those different EOL
types, so it doesn't need to be fixed.

Aurel



