[MPlayer-users] Bugreport: Messed up charset in a file identification
Ivan Doležal
ivan.dolezal at vsb.cz
Tue Jul 15 17:27:38 CEST 2003
Hello,
encouraged by Mr. Kenzelmann on mplayer-users, who suggested that the
relevant code is not internationalized (see below), I'd like to report a
bug:
I want to get the metadata (name, author, copyright) of a WMA stream.
Unfortunately, when the metadata are displayed, the Czech diacritics are
completely messed up. Some Czech characters are replaced with other,
nonsense characters from the Latin alphabet (ř becomes Y and Ř becomes X,
for example), so there is no way to fix that with a simple tr///. A
famous Czech test sentence, used here as the clip name
"PŘÍLIŠ ŽLUŤOUČKÝ KŮŇ ÚPĚL ĎÁBELSKÉ ÓDY" and as the author
"příliš žluťoučký kůň úpěl ďábelské ódy", has been destroyed into...
$ mplayer -nocache -identify mms://server1.streaming.cesnet.cz/others/ols/czech.wma | hexdump -c
[...snip...]
0000660 n a m e : P X 315 L I ` } L U
0000670 d O U \f K 335 K n G 332 P 032 L
0000680 016 301 B E L S K 311 323 D Y \n a u
0000690 t h o r : p Y 355 l i a ~ l u
00006a0 e o u \r k 375 k o H 372 p 033 l
00006b0 017 341 b e l s k 351 363 d y \n c o
00006c0 p y r i g h t : \n c o m m e
... no matter how LANG/LC_ALL was set...
$ export LC_ALL=cs_CZ.cp1250 ; export LANG=cs_CZ.cp1250
$ export LC_ALL=cs_CZ.iso88592 ; export LANG=cs_CZ.iso88592
$ export LC_ALL=cs_CZ.utf8 ; export LANG=cs_CZ.utf8
$ export LC_ALL=en_US.iso88591 ; export LANG=en_US.iso88591
$ export LC_ALL=en_US.utf8 ; export LANG=en_US.utf8
(all of the locales mentioned above are listed by locale -a)
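One possible reading of the dump above, offered as an assumption rather
than a confirmed diagnosis: each output byte looks like the Unicode code
point of the original character with its high byte dropped (Ř is U+0158
and 0x58 is "X"; Č is U+010C and 0x0C is the form feed \f). That would
also explain why the locale settings make no difference. The sketch below
reproduces the name: and author: bytes under that assumption. The helper
low_bytes() is made up for this illustration and is not MPlayer code.

```python
import unicodedata

# Assumption (not verified against the MPlayer source): the clip info is
# stored as 16-bit Unicode and only the low byte of each character is
# printed. low_bytes() is a hypothetical helper made up for this sketch.
def low_bytes(text: str) -> bytes:
    """Return the low 8 bits of every code point in `text`."""
    composed = unicodedata.normalize("NFC", text)  # one code point per letter
    return bytes(ord(ch) & 0xFF for ch in composed)

name = "PŘÍLIŠ ŽLUŤOUČKÝ KŮŇ ÚPĚL ĎÁBELSKÉ ÓDY"
author = "příliš žluťoučký kůň úpěl ďábelské ódy"

# Print the resulting bytes; compare them with the hexdump -c lines above
# (hexdump -c shows non-printable bytes in octal, Python shows them in hex).
print(low_bytes(name))
print(low_bytes(author))
```

If the assumption holds, the printed bytes match the dump byte for byte,
for example 0xCD (octal 315) for Í and 0x1A (octal 032) for Ě.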
When I play the stream in Windows Media Player, the characters are
displayed correctly.
It would be extremely useful for me if mplayer printed this output in
any known charset.
If this is not a bug, could you please give me some advice or a hack for
getting the text in any well-known character encoding? I need to get a
small job done quickly... please... I need it only for the Czech language
(a single encoding would be enough)...
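To illustrate why a plain tr/// cannot undo the damage, here is a rough
sketch. It assumes (this is a guess from the hexdump, not a confirmed
fact) that every output byte is the low byte of the character's Unicode
code point. Under that assumption some Czech letters land on control
bytes and can be restored, while others land on ordinary ASCII letters
and cannot be told apart from them. All names in the sketch (CZECH,
AMBIGUOUS, UNAMBIGUOUS, partial_fix) are made up for the illustration.

```python
import unicodedata

# Czech letters with diacritics, normalized to one code point per letter.
CZECH = unicodedata.normalize("NFC", "áčďéěíňóřšťúůýžÁČĎÉĚÍŇÓŘŠŤÚŮÝŽ")

# Assumed corruption: only the low byte of each code point survives.
TRUNCATED = {ch: ord(ch) & 0xFF for ch in CZECH}

# Letters that land on printable ASCII collide with real ASCII text,
# so they cannot be restored reliably (for example "a" may be "a" or "š").
AMBIGUOUS = {chr(b): ch for ch, b in TRUNCATED.items() if 0x20 <= b < 0x7F}

# Letters that land on control bytes are unlikely to appear in clip info
# for any other reason, so mapping them back is a reasonable guess.
UNAMBIGUOUS = {chr(b): ch for ch, b in TRUNCATED.items() if b < 0x20}

def partial_fix(raw: bytes) -> str:
    """Best-effort repair of one metadata line (no newline included)."""
    text = raw.decode("latin-1")  # bytes 0x80-0xFF are already correct
    return "".join(UNAMBIGUOUS.get(ch, ch) for ch in text)

garbled_author = b"pY\xedlia ~lueou\rk\xfd koH \xfap\x1bl \x0f\xe1belsk\xe9 \xf3dy"
print(partial_fix(garbled_author))
print(sorted(AMBIGUOUS.items()))
```

On the author line from the dump this gives "pYília ~lueoučký koH úpěl
ďábelské ódy": č, ě, ď and the letters with acute accents come back, but
ř, š, ž, ť, ů and ň stay wrong because they collide with plain ASCII.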
Thank you in advance for your reply!
Ivan Dolezal
Technical details follow:
All tests were done on an out-of-the-box Red Hat 9 with the prepackaged mplayer:
$ rpm -qa | grep mpl
mplayer-0.90-1
mplayer-codecs-win32-1.0-2
mplayer-codecs-win32-mjpeg2k-1.0-2
mplayer-common-0.90-1
mplayer-codecs-win32-qt-6.0-2
mplayer-codecs-linux-real-9.0-2
mplayer-codecs-win32-dmo-9.0-2
mplayer-codecs-win32-qtextras-1.0-2
mplayer-font-iso2-1.0-3
mplayer-codecs-linux-xanim-1.0-2
$ mplayer --version
Using GNU internationalization
Original domain: messages
Original dirname: /usr/share/locale
Current domain: mplayer
Current dirname: /usr/share/locale
MPlayer 0.90-RPM-3.1 (C) 2000-2003 Arpad Gereoffy (see DOCS)
CPU: Intel Pentium MMX P55C (new) (Family: 5, Stepping: 1)
Detected cache-line size is 32 bytes
CPUflags: MMX: 1 MMX2: 0 3DNow: 0 3DNow2: 0 SSE: 0 SSE2: 0
Compiled with Runtime CPU Detection - WARNING - this is not optimal!
To get best performance, recompile MPlayer with
--disable-runtime-cpudetection
Reading config file /etc/mplayer/mplayer.conf
Reading config file /home/dol72/.mplayer/config
$ uname -a
Linux localhost.localdomain 2.4.20-8 #1 Thu Mar 13 16:42:56 EST 2003 i586 i586 i386 GNU/Linux
$ ls -l /lib/libc[.-]*
-rwxr-xr-x 1 root root 1465640 Mar 14 00:30 /lib/libc-2.3.2.so
lrwxrwxrwx 1 root root 13 Jun 13 16:46 /lib/libc.so.6 -> libc-2.3.2.so
...just for fun...
$ /sbin/lsmod
[...snip...]
Module Size Used by Not tainted
nls_iso8859-2 4060 0 (unused)
nls_iso8859-1 3484 1 (autoclean)
nls_cp1250 4572 1 (autoclean)
and here is the full output:
$ export LC_ALL=en_US.iso88591
$ mplayer -nocache -identify mms://server1.streaming.cesnet.cz/others/ols/czech.wma | hexdump -c
0000000 U s i n g G N U i n t e r n
0000010 a t i o n a l i z a t i o n \n O
0000020 r i g i n a l d o m a i n :
0000030 m e s s a g e s \n O r i g i n a
0000040 l d i r n a m e : / u s r /
0000050 s h a r e / l o c a l e \n C u r
0000060 r e n t d o m a i n : m p l
0000070 a y e r \n C u r r e n t d i r
0000080 n a m e : / u s r / s h a r e
0000090 / l o c a l e \n \n \n M P l a y e
00000a0 r 0 . 9 0 - R P M - 3 . 1 (
00000b0 C ) 2 0 0 0 - 2 0 0 3 A r p
00000c0 a d G e r e o f f y ( s e e
00000d0 D O C S ) \n \n C P U : I n t
00000e0 e l P e n t i u m M M X P
can't open '/home/dol72/.mplayer/codecs.conf': No such file or directory
00000f0 5 5 C ( n e w ) ( F a m i l
0000100 y : 5 , S t e p p i n g :
0000110 1 ) \n D e t e c t e d c a c h
0000120 e - l i n e s i z e i s 3
0000130 2 b y t e s \n C P U f l a g s
0000140 : M M X : 1 M M X 2 :
0000150 0 3 D N o w : 0 3 D N o w
0000160 2 : 0 S S E : 0 S S E 2
0000170 : 0 \n C o m p i l e d w i t
0000180 h R u n t i m e C P U D e
0000190 t e c t i o n - W A R N I N
00001a0 G - t h i s i s n o t
00001b0 o p t i m a l ! \n T o g e t
00001c0 b e s t p e r f o r m a n c e
00001d0 , r e c o m p i l e M P l a
00001e0 y e r w i t h - - d i s a b
00001f0 l e - r u n t i m e - c p u d e
0000200 t e c t i o n \n R e a d i n g
0000210 c o n f i g f i l e / e t c
0000220 / m p l a y e r / m p l a y e r
0000230 . c o n f \n R e a d i n g c o
0000240 n f i g f i l e / h o m e /
0000250 d o l 7 2 / . m p l a y e r / c
0000260 o n f i g \n R e a d i n g / h
0000270 o m e / d o l 7 2 / . m p l a y
0000280 e r / c o d e c s . c o n f :
0000290 R e a d i n g / e t c / m p l
00002a0 a y e r / c o d e c s . c o n f
00002b0 : 5 0 a u d i o & 1 3 6
00002c0 v i d e o c o d e c s \n f o
00002d0 n t : c a n ' t o p e n f
00002e0 i l e : / h o m e / d o l 7 2
00002f0 / . m p l a y e r / f o n t / f
Linux RTC init error in ioctl (rtc_irqp_set 1024): Permission denied
0000300 o n t . d e s c \n F o n t / u
0000310 s r / s h a r e / m p l a y e r
0000320 / f o n t / f o n t . d e s c
0000330 l o a d e d s u c c e s s f u
0000340 l l y ! ( 2 1 0 c h a r s )
0000350 \n T r y a d d i n g " e c h
0000360 o 1 0 2 4 > / p r o c / s
0000370 y s / d e v / r t c / m a x - u
0000380 s e r - f r e q " t o y o u
0000390 r s y s t e m s t a r t u p
00003a0 s c r i p t s . \n U s i n g
00003b0 u s l e e p ( ) t i m i n g \n
Can't open input config file /home/dol72/.mplayer/input.conf : No such file or directory
00003c0 I n p u t c o n f i g f i l
00003d0 e / e t c / m p l a y e r / i
00003e0 n p u t . c o n f p a r s e d
00003f0 : 5 2 b i n d s \n \n P l a
0000400 y i n g m m s : / / s e r v e
0000410 r 1 . s t r e a m i n g . c e s
0000420 n e t . c z / o t h e r s / o l
0000430 s / c z e c h . w m a \n R e s o
0000440 l v i n g s e r v e r 1 . s t
0000450 r e a m i n g . c e s n e t . c
0000460 z . . . \n C o n n e c t i n g
0000470 t o s e r v e r s e r v e
0000480 r 1 . s t r e a m i n g . c e s
0000490 n e t . c z [ 1 9 5 . 1 1 3 . 1
Connect error : Connection refused
00004a0 6 1 . 9 9 ] : 8 0 . . . \n R e
00004b0 s o l v i n g s e r v e r 1 .
00004c0 s t r e a m i n g . c e s n e t
00004d0 . c z . . . \n C o n n e c t i
00004e0 n g t o s e r v e r s e r
00004f0 v e r 1 . s t r e a m i n g . c
0000500 e s n e t . c z [ 1 9 5 . 1 1 3
0000510 . 1 6 1 . 9 9 ] : 1 7 5 5 . .
0000520 . \n c o n n e c t e d \n f i l e
0000530 o b j e c t , p a c k e t
0000540 l e n g t h = 1 5 1 6 ( 1
0000550 5 1 6 ) \n u n k n o w n o b j
0000560 e c t \n s t r e a m o b j e c
0000570 t , s t r e a m i d : 1 \n
0000580 u n k n o w n o b j e c t \n u
0000590 n k n o w n o b j e c t \n d a
00005a0 t a o b j e c t \n m m s t p
00005b0 a c k e t _ l e n g t h = 1
00005c0 5 1 6 \n C a c h e s i z e s
00005d0 e t t o 0 K B y t e s \n C
00005e0 o n n e c t e d t o s e r v
00005f0 e r : s e r v e r 1 . s t r e
0000600 a m i n g . c e s n e t . c z \n
0000610 S t r e a m n o t s e e k a
0000620 b l e ! \n A S F f i l e f o
0000630 r m a t d e t e c t e d . \n S
0000640 t r e a m n o t s e e k a b
0000650 l e ! \n C l i p i n f o : \n
0000660 n a m e : P X Í L I ` } L U
0000670 d O U \f K Ý K n G Ú P 032 L
0000680 016 Á B E L S K É Ó D Y \n a u
0000690 t h o r : p Y í l i a ~ l u
00006a0 e o u \r k ý k o H ú p 033 l
00006b0 017 á b e l s k é ó d y \n c o
00006c0 p y r i g h t : \n c o m m e
00006d0 n t s : \n = = = = = = = = = =
00006e0 = = = = = = = = = = = = = = = =
*
0000720 \n O p e n i n g a u d i o d
0000730 e c o d e r : [ f f m p e g ]
0000740 F F m p e g / l i b a v c o d
0000750 e c a u d i o d e c o d e r
0000760 s \n A U D I O : 4 4 1 0 0 H
0000770 z , 2 c h , 1 6 b i t
0000780 ( 0 x 1 0 ) , r a t i o : 8
0000790 0 0 5 - > 1 7 6 4 0 0 ( 6 4 .
00007a0 0 k b i t ) \n S e l e c t e d
00007b0 a u d i o c o d e c : [ f
00007c0 f w m a v 2 ] a f m : f f m p
00007d0 e g ( D i v X a u d i o v
00007e0 2 ( f f m p e g ) ) \n = = = =
00007f0 = = = = = = = = = = = = = = = =
*
0000830 = = = = = = \n I D _ F I L E N A
0000840 M E = m m s : / / s e r v e r 1
0000850 . s t r e a m i n g . c e s n e
0000860 t . c z / o t h e r s / o l s /
0000870 c z e c h . w m a \n I D _ A U D
0000880 I O _ C O D E C = f f w m a v 2
0000890 \n I D _ A U D I O _ F O R M A T
00008a0 = 3 5 3 \n I D _ A U D I O _ B I
00008b0 T R A T E = 6 4 0 4 0 \n I D _ A
00008c0 U D I O _ R A T E = 4 4 1 0 0 \n
00008d0 I D _ A U D I O _ N C H = 2 \n I
00008e0 D _ L E N G T H = 3 0 9 \n \n \n E
00008f0 x i t i n g . . . ( E n d o
0000900 f f i l e ) \n
0000908
-------- Original Message --------
Subject: Re: [MPlayer-users] Messed up charset in a file identification
Date: 15 Jul 2003 16:55:15 +0200
From: Daniel Kenzelmann <kenzelma at stud.uni-frankfurt.de>
To: Ivan Doležal <ivan.dolezal at vsb.cz>
References: <3F13C7A2.3030500 at vsb.cz>
<1058273337.1464.8.camel at ddd.whgl.uni-frankfurt.de>
<3F140B81.20609 at vsb.cz> <3F14111D.4030604 at vsb.cz>
On Tue, 2003-07-15 at 16:35, Ivan Doležal wrote:
btw, i checked and yes, the output is nowhere near any sane encoding.
Maybe the part of the code which delivers that information isn't
internationalized .. keep on telling the developers .. seems like a real
bug.