[MPlayer-users] unicode subtitles

Artur Zaprzala zybi at talex.com.pl
Mon May 6 21:07:02 CEST 2002


Georgi Georgiev wrote:
>>Georgi Georgiev wrote:
>>
>>>TOOLS/subfont-c/encodings/charmap2enc has a line saying if (c<"80"), and I 
>>>don't understand what it is doing there. There are a lot of single-byte 
>>>encodings that use almost all the 0x100 possible values of a byte. I for 
>>>example couldn't create a koi8-r font because of that.
>>
>>charmap2enc was a simple way to support EUC encodings and `if (c<"80")' 
>>is there because mplayer with the -unicode option uses a similar condition 
>>to distinguish multibyte sequences.
> 
> 
> What about using charmap2enc to create an encoding that is intended to be used with mplayer WITHOUT the -unicode option? As I stated in the mail before the last (just look at the double-quoted text), I DID need to remove the "if (c<0x80)" clause when creating a koi8-r font. What about you guys who were doing the iso-8859-2 fonts, or just about any encoding that is single-byte and has all the special symbols in the > 0x80 area?

The subtitle encoding issue seems a bit confusing, so I'll try to 
summarize it here.

There are 2 approaches:

1. (preferred) You can generate a Unicode subtitle font with:
	subfont --unicode <single-byte encoding known by iconv> ...
or
	subfont --unicode <path to custom encoding file> ...
	(this custom encoding file could list all iso-8859-* characters, to
	create a single font file for the common encodings)

and then run mplayer this way (-subcp and -utf8 expect a Unicode font!):
	mplayer -subcp <any encoding known by iconv> ...
or
	mplayer -utf8 ...

2. (current) Generate a subtitle font for one specific encoding with:
	subfont <single-byte encoding known by iconv> ...
or
	subfont <path to custom single-byte or EUC encoding file> ...

and then run mplayer without any encoding options for single-byte 
encodings, or with the -unicode option for EUC (and the like) encodings 
(which is only partially implemented in mplayer).
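
To make the difference between the two approaches concrete, here is a 
minimal sketch in Python. It is not MPlayer's code: the function names are 
made up for illustration, and it assumes that in approach 2 a single-byte 
font is simply indexed by the raw byte value, while in approach 1 the text 
is first converted to Unicode code points.

```python
def glyph_codes_unicode_font(raw, encoding):
    """Approach 1 (assumed behaviour): decode the subtitle bytes from the
    given encoding, then look glyphs up by Unicode code point."""
    return [ord(ch) for ch in raw.decode(encoding)]


def glyph_codes_single_byte_font(raw):
    """Approach 2 (assumed behaviour): no conversion at all, every byte
    value is used directly as the glyph index."""
    return list(raw)


if __name__ == "__main__":
    sample = b"\xc1"  # one byte above 0x7f
    # The same byte means different characters in different encodings,
    # but a single Unicode font can serve both.
    print(hex(glyph_codes_unicode_font(sample, "koi8-r")[0]))
    print(hex(glyph_codes_unicode_font(sample, "iso-8859-2")[0]))
    # With a per-encoding font the byte value itself selects the glyph.
    print(hex(glyph_codes_single_byte_font(sample)[0]))
```

With approach 2 the font has to be rebuilt for every encoding, because the 
byte 0xc1 must show a different glyph in koi8-r than in iso-8859-2.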

AFAIK, the CJK encodings (EUC-*, BIG5 and GB2312) work more or less this way:
- 0x8e (SINGLE-SHIFT TWO, SS2) begins a 2-byte character,
- 0x8f (SINGLE-SHIFT THREE, SS3) begins a 3-byte character,
- 0xa0-0xff begin 2-byte characters,
- other characters are single-byte.
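
Taken literally, the rules above can be sketched like this (Python, only an 
illustration of the list, not the code in mplayer or charmap2enc; the 
function names are hypothetical and the byte ranges are exactly the 
approximate ones given above):

```python
def sequence_length(lead):
    """Length in bytes of the character that starts with byte `lead`."""
    if lead == 0x8E:            # SS2 begins a 2-byte character
        return 2
    if lead == 0x8F:            # SS3 begins a 3-byte character
        return 3
    if 0xA0 <= lead <= 0xFF:    # high bytes begin 2-byte characters
        return 2
    return 1                    # everything else is single-byte


def split_characters(data):
    """Split a byte string into characters using sequence_length()."""
    chars = []
    i = 0
    while i < len(data):
        n = sequence_length(data[i])
        chars.append(data[i:i + n])  # a truncated tail is kept as it is
        i += n
    return chars


if __name__ == "__main__":
    print(split_characters(b"A\xb0\xa1B"))
```

Note that a single-byte text such as koi8-r would be split wrongly by these 
rules, since its letters also live in the 0xa0-0xff range. That is why the 
multibyte test only makes sense together with the -unicode option.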


I tested the charmap2enc script only with 
/usr/share/i18n/charmaps/EUC-KR.gz (on RedHat). It wasn't intended to be 
perfect.


-- 
Artur Zaprzala





