[MPlayer-users] Bugreport: Messed up charset in a file identification

Ivan Doležal ivan.dolezal at vsb.cz
Tue Jul 15 17:27:38 CEST 2003



Hello,

    Encouraged by Mr. Kenzelmann on mplayer-users, who suggested that 
the relevant code may not be internationalized (see below), I'd like to 
report a bug:


   I want to get the metadata (name, author, copyright) of a WMA stream. 
Unfortunately, when the metadata are displayed, the Czech diacritics are 
completely messed up. Some Czech characters are replaced with unrelated 
characters from the plain Latin alphabet (Ř becomes X and ř becomes Y, 
for example), so there is no way to repair the output with a simple 
tr///. The stream carries a well-known Czech test sentence as its clip 
name, "PŘÍLIŠ ŽLUŤOUČKÝ KŮŇ ÚPĚL ĎÁBELSKÉ ÓDY", and the same sentence in 
lowercase as its author, "příliš žluťoučký kůň úpěl ďábelské ódy". It 
comes out as...

$ mplayer -nocache -identify mms://server1.streaming.cesnet.cz/others/ols/czech.wma | hexdump -c

[...snip...]

0000660   n   a   m   e   :       P   X 315   L   I   `       }   L   U
0000670   d   O   U  \f   K 335       K   n   G     332   P 032   L
0000680 016 301   B   E   L   S   K 311     323   D   Y  \n       a   u
0000690   t   h   o   r   :       p   Y 355   l   i   a       ~   l   u
00006a0   e   o   u  \r   k 375       k   o   H     372   p 033   l
00006b0 017 341   b   e   l   s   k 351     363   d   y  \n       c   o
00006c0   p   y   r   i   g   h   t   :      \n       c   o   m   m   e


... no matter how LANG/LC_ALL was set...

$ export LC_ALL=cs_CZ.cp1250 ; export LANG=cs_CZ.cp1250
$ export LC_ALL=cs_CZ.iso88592 ; export LANG=cs_CZ.iso88592
$ export LC_ALL=cs_CZ.utf8 ; export LANG=cs_CZ.utf8
$ export LC_ALL=en_US.iso88591 ; export LANG=en_US.iso88591
$ export LC_ALL=en_US.utf8 ; export LANG=en_US.utf8

(all of the locales mentioned are listed by locale -a)

When I play the stream in Windows Media Player, the characters are 
displayed correctly.


It would be extremely useful for me if mplayer displayed the output in 
any known charset.

If this is not a bug, could you please give me some advice, or a hack, 
on how to get the original text in any well-known character encoding? I 
need to get a small job done quickly, and I need it only for the Czech 
language (a single encoding would be enough).
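
One possible workaround, sketched under assumptions and not taken from 
MPlayer: if the beginning of the stream (the ASF header) can be saved to 
a local file by some means, the clip info can be read from it directly. 
The sketch assumes the commonly documented layout of the ASF Content 
Description Object (16-byte GUID, 64-bit object size, five 16-bit 
lengths, then the strings as UTF-16LE). The function name and the file 
name in the usage note are hypothetical.

```python
import struct

# GUID 75B22633-668E-11CF-A6D9-00AA0062CE6C in its on-disk byte order
# (assumed to identify the ASF Content Description Object).
CONTENT_DESCRIPTION_GUID = bytes.fromhex("3326b2758e66cf11a6d900aa0062ce6c")

# Field order assumed from the ASF layout; "description" is what the
# player output labels "comments".
FIELDS = ("title", "author", "copyright", "description", "rating")


def read_content_description(data: bytes) -> dict:
    """Find the Content Description Object in data and decode its strings."""
    start = data.find(CONTENT_DESCRIPTION_GUID)
    if start < 0:
        raise ValueError("no Content Description Object found")
    # Skip the 16-byte GUID and the 8-byte object size, then read the
    # five little-endian 16-bit byte lengths.
    lengths = struct.unpack_from("<5H", data, start + 24)
    pos = start + 24 + 10
    result = {}
    for name, length in zip(FIELDS, lengths):
        raw = data[pos:pos + length]
        # Strings are UTF-16LE and usually carry a trailing NUL.
        result[name] = raw.decode("utf-16-le").rstrip("\x00")
        pos += length
    return result
```

For example, with the header saved as czech.asf (a hypothetical file 
name), read_content_description(open("czech.asf", "rb").read()) would 
return the fields as Python strings, which can then be printed in 
whatever encoding the terminal uses.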


Thank you in advance for your reply!

Ivan Dolezal


Technical details follow:

All tests were done on an out-of-the-box Red Hat 9 installation with the 
prepackaged mplayer:

$ rpm -qa | grep mpl
mplayer-0.90-1
mplayer-codecs-win32-1.0-2
mplayer-codecs-win32-mjpeg2k-1.0-2
mplayer-common-0.90-1
mplayer-codecs-win32-qt-6.0-2
mplayer-codecs-linux-real-9.0-2
mplayer-codecs-win32-dmo-9.0-2
mplayer-codecs-win32-qtextras-1.0-2
mplayer-font-iso2-1.0-3
mplayer-codecs-linux-xanim-1.0-2


$ mplayer --version

Using GNU internationalization
Original domain: messages
Original dirname: /usr/share/locale
Current domain: mplayer
Current dirname: /usr/share/locale


MPlayer 0.90-RPM-3.1 (C) 2000-2003 Arpad Gereoffy (see DOCS)

CPU: Intel Pentium MMX P55C (new) (Family: 5, Stepping: 1)
Detected cache-line size is 32 bytes
CPUflags:  MMX: 1 MMX2: 0 3DNow: 0 3DNow2: 0 SSE: 0 SSE2: 0
Compiled with Runtime CPU Detection - WARNING - this is not optimal!
To get best performance, recompile MPlayer with --disable-runtime-cpudetection
Reading config file /etc/mplayer/mplayer.conf
Reading config file /home/dol72/.mplayer/config


$ uname -a
Linux localhost.localdomain 2.4.20-8 #1 Thu Mar 13 16:42:56 EST 2003 i586 i586 i386 GNU/Linux


$ ls -l /lib/libc[.-]*
-rwxr-xr-x    1 root     root      1465640 Mar 14 00:30 /lib/libc-2.3.2.so
lrwxrwxrwx    1 root     root           13 Jun 13 16:46 /lib/libc.so.6 -> libc-2.3.2.so


...just for fun...

$ /sbin/lsmod
[...snip...]
Module                  Size  Used by    Not tainted
nls_iso8859-2           4060   0  (unused)
nls_iso8859-1           3484   1  (autoclean)
nls_cp1250              4572   1  (autoclean)



and here is the full output:

$ export LC_ALL=en_US.iso88591
$ mplayer -nocache -identify mms://server1.streaming.cesnet.cz/others/ols/czech.wma | hexdump -c
0000000   U   s   i   n   g       G   N   U       i   n   t   e   r   n
0000010   a   t   i   o   n   a   l   i   z   a   t   i   o   n  \n   O
0000020   r   i   g   i   n   a   l       d   o   m   a   i   n   :
0000030   m   e   s   s   a   g   e   s  \n   O   r   i   g   i   n   a
0000040   l       d   i   r   n   a   m   e   :       /   u   s   r   /
0000050   s   h   a   r   e   /   l   o   c   a   l   e  \n   C   u   r
0000060   r   e   n   t       d   o   m   a   i   n   :       m   p   l
0000070   a   y   e   r  \n   C   u   r   r   e   n   t       d   i   r
0000080   n   a   m   e   :       /   u   s   r   /   s   h   a   r   e
0000090   /   l   o   c   a   l   e  \n  \n  \n   M   P   l   a   y   e
00000a0   r       0   .   9   0   -   R   P   M   -   3   .   1       (
00000b0   C   )       2   0   0   0   -   2   0   0   3       A   r   p
00000c0   a   d       G   e   r   e   o   f   f   y       (   s   e   e
00000d0       D   O   C   S   )  \n  \n   C   P   U   :       I   n   t
00000e0   e   l       P   e   n   t   i   u   m       M   M   X       P
can't open '/home/dol72/.mplayer/codecs.conf': No such file or directory
00000f0   5   5   C       (   n   e   w   )       (   F   a   m   i   l
0000100   y   :       5   ,       S   t   e   p   p   i   n   g   :
0000110   1   )  \n   D   e   t   e   c   t   e   d       c   a   c   h
0000120   e   -   l   i   n   e       s   i   z   e       i   s       3
0000130   2       b   y   t   e   s  \n   C   P   U   f   l   a   g   s
0000140   :           M   M   X   :       1       M   M   X   2   :
0000150   0       3   D   N   o   w   :       0       3   D   N   o   w
0000160   2   :       0       S   S   E   :       0       S   S   E   2
0000170   :       0  \n   C   o   m   p   i   l   e   d       w   i   t
0000180   h       R   u   n   t   i   m   e       C   P   U       D   e
0000190   t   e   c   t   i   o   n       -       W   A   R   N   I   N
00001a0   G       -       t   h   i   s       i   s       n   o   t
00001b0   o   p   t   i   m   a   l   !  \n   T   o       g   e   t
00001c0   b   e   s   t       p   e   r   f   o   r   m   a   n   c   e
00001d0   ,       r   e   c   o   m   p   i   l   e       M   P   l   a
00001e0   y   e   r       w   i   t   h       -   -   d   i   s   a   b
00001f0   l   e   -   r   u   n   t   i   m   e   -   c   p   u   d   e
0000200   t   e   c   t   i   o   n  \n   R   e   a   d   i   n   g
0000210   c   o   n   f   i   g       f   i   l   e       /   e   t   c
0000220   /   m   p   l   a   y   e   r   /   m   p   l   a   y   e   r
0000230   .   c   o   n   f  \n   R   e   a   d   i   n   g       c   o
0000240   n   f   i   g       f   i   l   e       /   h   o   m   e   /
0000250   d   o   l   7   2   /   .   m   p   l   a   y   e   r   /   c
0000260   o   n   f   i   g  \n   R   e   a   d   i   n   g       /   h
0000270   o   m   e   /   d   o   l   7   2   /   .   m   p   l   a   y
0000280   e   r   /   c   o   d   e   c   s   .   c   o   n   f   :
0000290   R   e   a   d   i   n   g       /   e   t   c   /   m   p   l
00002a0   a   y   e   r   /   c   o   d   e   c   s   .   c   o   n   f
00002b0   :       5   0       a   u   d   i   o       &       1   3   6
00002c0       v   i   d   e   o       c   o   d   e   c   s  \n   f   o
00002d0   n   t   :       c   a   n   '   t       o   p   e   n       f
00002e0   i   l   e   :       /   h   o   m   e   /   d   o   l   7   2
00002f0   /   .   m   p   l   a   y   e   r   /   f   o   n   t   /   f
Linux RTC init error in ioctl (rtc_irqp_set 1024): Permission denied
0000300   o   n   t   .   d   e   s   c  \n   F   o   n   t       /   u
0000310   s   r   /   s   h   a   r   e   /   m   p   l   a   y   e   r
0000320   /   f   o   n   t   /   f   o   n   t   .   d   e   s   c
0000330   l   o   a   d   e   d       s   u   c   c   e   s   s   f   u
0000340   l   l   y   !       (   2   1   0       c   h   a   r   s   )
0000350  \n   T   r   y       a   d   d   i   n   g       "   e   c   h
0000360   o       1   0   2   4       >       /   p   r   o   c   /   s
0000370   y   s   /   d   e   v   /   r   t   c   /   m   a   x   -   u
0000380   s   e   r   -   f   r   e   q   "       t   o       y   o   u
0000390   r       s   y   s   t   e   m       s   t   a   r   t   u   p
00003a0       s   c   r   i   p   t   s   .  \n   U   s   i   n   g
00003b0   u   s   l   e   e   p   (   )       t   i   m   i   n   g  \n
Can't open input config file /home/dol72/.mplayer/input.conf : No such file or directory
00003c0   I   n   p   u   t       c   o   n   f   i   g       f   i   l
00003d0   e       /   e   t   c   /   m   p   l   a   y   e   r   /   i
00003e0   n   p   u   t   .   c   o   n   f       p   a   r   s   e   d
00003f0       :       5   2       b   i   n   d   s  \n  \n   P   l   a
0000400   y   i   n   g       m   m   s   :   /   /   s   e   r   v   e
0000410   r   1   .   s   t   r   e   a   m   i   n   g   .   c   e   s
0000420   n   e   t   .   c   z   /   o   t   h   e   r   s   /   o   l
0000430   s   /   c   z   e   c   h   .   w   m   a  \n   R   e   s   o
0000440   l   v   i   n   g       s   e   r   v   e   r   1   .   s   t
0000450   r   e   a   m   i   n   g   .   c   e   s   n   e   t   .   c
0000460   z       .   .   .  \n   C   o   n   n   e   c   t   i   n   g
0000470       t   o       s   e   r   v   e   r       s   e   r   v   e
0000480   r   1   .   s   t   r   e   a   m   i   n   g   .   c   e   s
0000490   n   e   t   .   c   z   [   1   9   5   .   1   1   3   .   1
Connect error : Connection refused
00004a0   6   1   .   9   9   ]   :   8   0       .   .   .  \n   R   e
00004b0   s   o   l   v   i   n   g       s   e   r   v   e   r   1   .
00004c0   s   t   r   e   a   m   i   n   g   .   c   e   s   n   e   t
00004d0   .   c   z       .   .   .  \n   C   o   n   n   e   c   t   i
00004e0   n   g       t   o       s   e   r   v   e   r       s   e   r
00004f0   v   e   r   1   .   s   t   r   e   a   m   i   n   g   .   c
0000500   e   s   n   e   t   .   c   z   [   1   9   5   .   1   1   3
0000510   .   1   6   1   .   9   9   ]   :   1   7   5   5       .   .
0000520   .  \n   c   o   n   n   e   c   t   e   d  \n   f   i   l   e
0000530       o   b   j   e   c   t   ,       p   a   c   k   e   t
0000540   l   e   n   g   t   h       =       1   5   1   6       (   1
0000550   5   1   6   )  \n   u   n   k   n   o   w   n       o   b   j
0000560   e   c   t  \n   s   t   r   e   a   m       o   b   j   e   c
0000570   t   ,       s   t   r   e   a   m       i   d   :       1  \n
0000580   u   n   k   n   o   w   n       o   b   j   e   c   t  \n   u
0000590   n   k   n   o   w   n       o   b   j   e   c   t  \n   d   a
00005a0   t   a       o   b   j   e   c   t  \n   m   m   s   t       p
00005b0   a   c   k   e   t   _   l   e   n   g   t   h       =       1
00005c0   5   1   6  \n   C   a   c   h   e       s   i   z   e       s
00005d0   e   t       t   o       0       K   B   y   t   e   s  \n   C
00005e0   o   n   n   e   c   t   e   d       t   o       s   e   r   v
00005f0   e   r   :       s   e   r   v   e   r   1   .   s   t   r   e
0000600   a   m   i   n   g   .   c   e   s   n   e   t   .   c   z  \n
0000610   S   t   r   e   a   m       n   o   t       s   e   e   k   a
0000620   b   l   e   !  \n   A   S   F       f   i   l   e       f   o
0000630   r   m   a   t       d   e   t   e   c   t   e   d   .  \n   S
0000640   t   r   e   a   m       n   o   t       s   e   e   k   a   b
0000650   l   e   !  \n   C   l   i   p       i   n   f   o   :  \n
0000660   n   a   m   e   :       P   X   Í   L   I   `       }   L   U
0000670   d   O   U  \f   K   Ý       K   n   G       Ú   P 032   L
0000680 016   Á   B   E   L   S   K   É       Ó   D   Y  \n       a   u
0000690   t   h   o   r   :       p   Y   í   l   i   a       ~   l   u
00006a0   e   o   u  \r   k   ý       k   o   H       ú   p 033   l
00006b0 017   á   b   e   l   s   k   é       ó   d   y  \n       c   o
00006c0   p   y   r   i   g   h   t   :      \n       c   o   m   m   e
00006d0   n   t   s   :      \n   =   =   =   =   =   =   =   =   =   =
00006e0   =   =   =   =   =   =   =   =   =   =   =   =   =   =   =   =
*
0000720  \n   O   p   e   n   i   n   g       a   u   d   i   o       d
0000730   e   c   o   d   e   r   :       [   f   f   m   p   e   g   ]
0000740       F   F   m   p   e   g   /   l   i   b   a   v   c   o   d
0000750   e   c       a   u   d   i   o       d   e   c   o   d   e   r
0000760   s  \n   A   U   D   I   O   :       4   4   1   0   0       H
0000770   z   ,       2       c   h   ,       1   6       b   i   t
0000780   (   0   x   1   0   )   ,       r   a   t   i   o   :       8
0000790   0   0   5   -   >   1   7   6   4   0   0       (   6   4   .
00007a0   0       k   b   i   t   )  \n   S   e   l   e   c   t   e   d
00007b0       a   u   d   i   o       c   o   d   e   c   :       [   f
00007c0   f   w   m   a   v   2   ]       a   f   m   :   f   f   m   p
00007d0   e   g       (   D   i   v   X       a   u   d   i   o       v
00007e0   2       (   f   f   m   p   e   g   )   )  \n   =   =   =   =
00007f0   =   =   =   =   =   =   =   =   =   =   =   =   =   =   =   =
*
0000830   =   =   =   =   =   =  \n   I   D   _   F   I   L   E   N   A
0000840   M   E   =   m   m   s   :   /   /   s   e   r   v   e   r   1
0000850   .   s   t   r   e   a   m   i   n   g   .   c   e   s   n   e
0000860   t   .   c   z   /   o   t   h   e   r   s   /   o   l   s   /
0000870   c   z   e   c   h   .   w   m   a  \n   I   D   _   A   U   D
0000880   I   O   _   C   O   D   E   C   =   f   f   w   m   a   v   2
0000890  \n   I   D   _   A   U   D   I   O   _   F   O   R   M   A   T
00008a0   =   3   5   3  \n   I   D   _   A   U   D   I   O   _   B   I
00008b0   T   R   A   T   E   =   6   4   0   4   0  \n   I   D   _   A
00008c0   U   D   I   O   _   R   A   T   E   =   4   4   1   0   0  \n
00008d0   I   D   _   A   U   D   I   O   _   N   C   H   =   2  \n   I
00008e0   D   _   L   E   N   G   T   H   =   3   0   9  \n  \n  \n   E
00008f0   x   i   t   i   n   g   .   .   .       (   E   n   d       o
0000900   f       f   i   l   e   )  \n
0000908


-------- Original Message --------
Subject: Re: [MPlayer-users] Messed up charset in a file identification
Date: 15 Jul 2003 16:55:15 +0200
From: Daniel Kenzelmann <kenzelma at stud.uni-frankfurt.de>
To: Ivan Doležal <ivan.dolezal at vsb.cz>
References: <3F13C7A2.3030500 at vsb.cz>	 
<1058273337.1464.8.camel at ddd.whgl.uni-frankfurt.de> 
<3F140B81.20609 at vsb.cz>	 <3F14111D.4030604 at vsb.cz>

On Tue, 2003-07-15 at 16:35, Ivan Doležal wrote:


btw, i checked and yes, the output is nowhere near any sane encoding.
Maybe the part of the code which delivers that information isn't
internationalized .. keep on telling the developers .. seems like a real
bug.




