[MPlayer-users] Playing an URL with special characters, charset problem

Arthur Helfstein Fragoso arthur at life.net.br
Mon Jan 10 15:58:41 CET 2011


Hello,

Last year I emailed the list about a charset problem. Maybe I was rude 
or something, but I haven't received a reply since then.

On 09/10/2010 08:27 PM, arthur at life.net.br wrote:
 > Hello,
 >
 > I wrote a small script that uses Google text-to-speech as a plugin to a
 > study program called Anki.

On 09/13/2010 02:46 PM, arthur at life.net.br wrote:
>
> MPlayer plays it well, but the problem is that when I ask mplayer to
> play a URL that has special characters: â,ê,î,ô,û
> (%C3%A2%2C%C3%AA%2C%C3%AE%2C%C3%B4%2C%C3%BB), it will play
> $,&,*,+,#(..%2C..%2C..%2C..%2C..) (it will just say a bunch of strange
> things). So it's not playing the right thing; it's like asking mplayer
> to play 'apple.mp3', but mplayer plays 'banana.mp3' instead. (It's not
> the best example, but..)
>
> I think Tom Evans understood the problem well. Maybe he is working on
> it, or he's just busy. :D
>
> Honestly, I think it will require changing something in MPlayer's code,
> or maybe in a network library used by mplayer. I may be wrong, but that's
> what I think.
>
> I appreciate your help, and I hope somebody will find a solution.
>
> Thank you,
>

Now the plugin has more than 800 downloads. Most people use it to learn 
English, but many of them are probably disappointed that they can't use 
it to learn a language with special characters.

I tried to describe the problem in as much detail as I could, so if it's 
not enough, please tell me what other information I need to provide.

I tested with MPlayer SVN-r32771-4.4.3, run from the terminal.

This time I used a Chinese word for the test, "你好" (ni hao), and I added 
a romanized word at the end just to show that mplayer can play it when it's 
not a special character. I used the string "你好hao" (%E4%BD%A0%E5%A5%BDhao).
In Firefox it plays correctly and says "ni hao hao",
while mplayer in the terminal says just "hao".
(I have to use the special characters because they carry the right 
tones, and the romanization does not.)
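
For reference, the percent-encoded form of the test string can be
reproduced with a short Python sketch. It assumes Python 3 and its
standard urllib.parse module; the variable names are only illustrative
and are not taken from the plugin.

```python
from urllib.parse import quote, unquote

# Test string from this report: two Chinese characters plus ASCII "hao".
text = "你好hao"

# quote() encodes the text as UTF-8 and percent-escapes every byte that
# is not an unreserved ASCII character.
encoded = quote(text)
print(encoded)  # %E4%BD%A0%E5%A5%BDhao

# Decoding the escapes gives back the original text.
print(unquote(encoded) == text)  # True
```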

After using a sniffer (Wireshark) to look at the HTTP request packet, I 
came to the conclusion that mplayer is getting the right input.
Input:
./mplayer 
"http://translate.google.com/translate_tts?tl=zh&q=%E4%BD%A0%E5%A5%BDhao"

mplayer output says: "Playing 
http://translate.google.com/translate_tts?tl=zh&q=%E4%BD%A0%E5%A5%BDhao."
but it makes the wrong HTTP request:
"/translate_tts?tl=zh&q=......hao"

Here "......hao" is the raw, already-decoded bytes:
e4 bd a0 e5 a5 bd 68 61 6f

It should be:

"/translate_tts?tl=zh&q=%E4%BD%A0%E5%A5%BDhao"

where "%E4%BD%A0%E5%A5%BDhao" is sent as the ASCII bytes:
25 45 34 25 42 44 25 41 30 25 45 35 25 41 35 25 42 44 68 61 6f
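
The two byte sequences can be reproduced with the following Python
sketch. It assumes Python 3.8 or later (for bytes.hex with a separator)
and only mirrors what the capture shows; it is not a statement about how
MPlayer handles the URL internally. The variable names are illustrative.

```python
from urllib.parse import unquote_to_bytes

encoded = "%E4%BD%A0%E5%A5%BDhao"

# What should go on the wire: the escapes kept as plain ASCII text.
expected_on_wire = encoded.encode("ascii")

# What the capture shows: the escapes decoded back into raw bytes.
observed_on_wire = unquote_to_bytes(encoded)

print(expected_on_wire.hex(" "))
# 25 45 34 25 42 44 25 41 30 25 45 35 25 41 35 25 42 44 68 61 6f
print(observed_on_wire.hex(" "))
# e4 bd a0 e5 a5 bd 68 61 6f
```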


Can somebody help me, please? Details are below.

Thank you very much.



arthur@arthur-laptop ~ $ uname -a
Linux arthur-laptop 2.6.32-21-generic #32-Ubuntu SMP Fri Apr 16 08:10:02 UTC 2010 i686 GNU/Linux

arthur@arthur-laptop ~ $ ls -l /lib/libc[.-]*
-rwxr-xr-x 1 root root 1335560 2010-10-22 12:30 /lib/libc-2.11.1.so
lrwxrwxrwx 1 root root      14 2010-10-23 14:26 /lib/libc.so.6 -> libc-2.11.1.so

arthur@arthur-laptop ~ $ gcc -v
Using built-in specs.
Target: i486-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu 
4.4.3-4ubuntu5' --with-bugurl=file:///usr/share/doc/gcc-4.4/README.Bugs 
--enable-languages=c,c++,fortran,objc,obj-c++ --prefix=/usr 
--enable-shared --enable-multiarch --enable-linker-build-id 
--with-system-zlib --libexecdir=/usr/lib --without-included-gettext 
--enable-threads=posix --with-gxx-include-dir=/usr/include/c++/4.4 
--program-suffix=-4.4 --enable-nls --enable-clocale=gnu 
--enable-libstdcxx-debug --enable-plugin --enable-objc-gc 
--enable-targets=all --disable-werror --with-arch-32=i486 
--with-tune=generic --enable-checking=release --build=i486-linux-gnu 
--host=i486-linux-gnu --target=i486-linux-gnu
Thread model: posix
gcc version 4.4.3 (Ubuntu 4.4.3-4ubuntu5)

arthur@arthur-laptop ~ $ ld -v
GNU ld (GNU Binutils for Ubuntu) 2.20.1-system.20100303

arthur@arthur-laptop ~ $ as --version
GNU assembler (GNU Binutils for Ubuntu) 2.20.1-system.20100303
Copyright 2009 Free Software Foundation, Inc.
This program is free software; you may redistribute it under the terms of
the GNU General Public License version 3 or later.
This program has absolutely no warranty.
This assembler was configured for a target of `i486-linux-gnu'.


Testing mplayer, using a sniffer to capture the HTTP request packet:

./mplayer 
"http://translate.google.com/translate_tts?tl=zh&q=%E4%BD%A0%E5%A5%BDhao"

0000  00 22 2d b1 fa 42 00 23  5a b3 16 de 08 00 45 00   ."-..B.# Z.....E.
0010  00 cb 2b 75 40 00 40 06  81 c7 c0 a8 02 67 4a 7d   ..+u@.@. .....gJ}
0020  7f 64 8f 28 00 50 c4 3b  3c 1a c3 34 89 04 80 18   .d.(.P.; <..4....
0030  00 5c f6 b5 00 00 01 01  08 0a 00 18 17 8a e0 aa   .\...... ........
0040  4c 4f 47 45 54 20 2f 74  72 61 6e 73 6c 61 74 65   LOGET /t ranslate
0050  5f 74 74 73 3f 74 6c 3d  7a 68 26 71 3d e4 bd a0   _tts?tl= zh&q=...
0060  e5 a5 bd 68 61 6f 20 48  54 54 50 2f 31 2e 30 0d   ...hao H TTP/1.0.
0070  0a 48 6f 73 74 3a 20 74  72 61 6e 73 6c 61 74 65   .Host: t ranslate
0080  2e 67 6f 6f 67 6c 65 2e  63 6f 6d 0d 0a 55 73 65   .google. com..Use
0090  72 2d 41 67 65 6e 74 3a  20 4d 50 6c 61 79 65 72   r-Agent:  MPlayer
00a0  20 53 56 4e 2d 72 33 32  32 31 39 2d 34 2e 34 2e    SVN-r32 219-4.4.
00b0  33 0d 0a 49 63 79 2d 4d  65 74 61 44 61 74 61 3a   3..Icy-M etaData:
00c0  20 31 0d 0a 43 6f 6e 6e  65 63 74 69 6f 6e 3a 20    1..Conn ection:
00d0  63 6c 6f 73 65 0d 0a 0d  0a                        close... .

./mplayer 
"http://translate.google.com/translate_tts?tl=zh&q=%E4%BD%A0%E5%A5%BDhao"
MPlayer SVN-r32771-4.4.3 (C) 2000-2010 MPlayer Team

Playing 
http://translate.google.com/translate_tts?tl=zh&q=%E4%BD%A0%E5%A5%BDhao.
Resolving translate.google.com for AF_INET6...

Couldn't resolve name for AF_INET6: translate.google.com
Resolving translate.google.com for AF_INET...
Connecting to server translate.google.com[74.125.127.139]: 80...

Cache size set to 320 KBytes
Cache fill:  1.16% (3794 bytes)

Audio only file format detected.
==========================================================================
Opening audio decoder: [mp3lib] MPEG layer-2, layer-3
AUDIO: 22050 Hz, 2 ch, s16le, 32.0 kbit/4.54% (ratio: 4000->88200)
Selected audio codec: [mp3] afm: mp3lib (mp3lib MPEG layer-2, layer-3)
==========================================================================
AO: [oss] 22050Hz 2ch s16le (2 bytes per sample)
Video: no video
Starting playback...
A:   0.0 (00.0) of 0.9 (00.9)  0.1% 0%


Exiting... (End of file)


Wget makes the right request (even though it won't download the file):

wget 
"http://translate.google.com/translate_tts?tl=zh&q=%E4%BD%A0%E5%A5%BDhao"

0000  00 22 2d b1 fa 42 00 23  5a b3 16 de 08 00 45 00   ."-..B.# Z.....E.
0010  00 d5 49 ae 40 00 40 06  0f e6 c0 a8 02 67 48 0e   ..I.@.@. .....gH.
0020  d5 71 ed 8e 00 50 53 c7  c5 e1 c0 f1 31 c7 80 18   .q...PS. ....1...
0030  00 5c 74 39 00 00 01 01  08 0a 00 15 8c 6c 6b 8f   .\t9.... .....lk.
0040  16 4b 47 45 54 20 2f 74  72 61 6e 73 6c 61 74 65   .KGET /t ranslate
0050  5f 74 74 73 3f 74 6c 3d  7a 68 26 71 3d 25 45 34   _tts?tl= zh&q=%E4
0060  25 42 44 25 41 30 25 45  35 25 41 35 25 42 44 68   %BD%A0%E 5%A5%BDh
0070  61 6f 20 48 54 54 50 2f  31 2e 30 0d 0a 55 73 65   ao HTTP/ 1.0..Use
0080  72 2d 41 67 65 6e 74 3a  20 57 67 65 74 2f 31 2e   r-Agent:  Wget/1.
0090  31 32 20 28 6c 69 6e 75  78 2d 67 6e 75 29 0d 0a   12 (linu x-gnu)..
00a0  41 63 63 65 70 74 3a 20  2a 2f 2a 0d 0a 48 6f 73   Accept:  */*..Hos
00b0  74 3a 20 74 72 61 6e 73  6c 61 74 65 2e 67 6f 6f   t: trans late.goo
00c0  67 6c 65 2e 63 6f 6d 0d  0a 43 6f 6e 6e 65 63 74   gle.com. .Connect
00d0  69 6f 6e 3a 20 4b 65 65  70 2d 41 6c 69 76 65 0d   ion: Kee p-Alive.
00e0  0a 0d 0a                                           ...
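
To compare the mplayer and wget captures directly, here is a small Python
sketch that pulls the request target out of each one. The hex strings
are copied from the two dumps above, starting at the "GET" (offset 0x42)
and ending at the first CRLF. The helper names request_target and
has_raw_non_ascii are made up for this illustration.

```python
# Request-line bytes copied from the mplayer capture.
mplayer_hex = """
47 45 54 20 2f 74 72 61 6e 73 6c 61 74 65
5f 74 74 73 3f 74 6c 3d 7a 68 26 71 3d e4 bd a0
e5 a5 bd 68 61 6f 20 48 54 54 50 2f 31 2e 30 0d 0a
"""

# Request-line bytes copied from the wget capture.
wget_hex = """
47 45 54 20 2f 74 72 61 6e 73 6c 61 74 65
5f 74 74 73 3f 74 6c 3d 7a 68 26 71 3d 25 45 34
25 42 44 25 41 30 25 45 35 25 41 35 25 42 44 68
61 6f 20 48 54 54 50 2f 31 2e 30 0d 0a
"""


def request_target(hex_dump: str) -> bytes:
    """Return the request target (second field of the request line)."""
    raw = bytes.fromhex("".join(hex_dump.split()))
    request_line = raw.split(b"\r\n", 1)[0]
    method, target, version = request_line.split(b" ")
    return target


def has_raw_non_ascii(target: bytes) -> bool:
    """True if the target contains bytes above 0x7f, i.e. unescaped ones."""
    return any(byte > 0x7F for byte in target)


for name, dump in (("mplayer", mplayer_hex), ("wget", wget_hex)):
    target = request_target(dump)
    print(name, target, "raw non-ASCII:", has_raw_non_ascii(target))
```

With these inputs the mplayer target contains raw bytes above 0x7f,
while the wget target is plain ASCII with the percent escapes intact.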


Firefox makes the right request and uses mplayer to play it:

firefox 
"http://translate.google.com/translate_tts?tl=zh&q=%E4%BD%A0%E5%A5%BDhao"

0000  00 22 2d b1 fa 42 00 23  5a b3 16 de 08 00 45 00   ."-..B.# Z.....E.
0010  05 be ad f7 40 00 40 06  a6 9a c0 a8 02 67 48 0e   ....@.@. .....gH.
0020  d5 8a cb 5a 00 50 27 9f  c5 1d 98 28 69 cc 80 10   ...Z.P'. ...(i...
0030  00 5c 54 91 00 00 01 01  08 0a 00 16 69 09 74 cf   .\T..... ....i.t.
0040  74 3e 47 45 54 20 2f 74  72 61 6e 73 6c 61 74 65   t>GET /t ranslate
0050  5f 74 74 73 3f 74 6c 3d  7a 68 26 71 3d 25 45 34   _tts?tl= zh&q=%E4
0060  25 42 44 25 41 30 25 45  35 25 41 35 25 42 44 68   %BD%A0%E 5%A5%BDh
0070  61 6f 20 48 54 54 50 2f  31 2e 31 0d 0a 48 6f 73   ao HTTP/ 1.1..Hos
0080  74 3a 20 74 72 61 6e 73  6c 61 74 65 2e 67 6f 6f   t: trans late.goo
0090  67 6c 65 2e 63 6f 6d 0d  0a 55 73 65 72 2d 41 67   gle.com. .User-Ag
00a0  65 6e 74 3a 20 4d 6f 7a  69 6c 6c 61 2f 35 2e 30   ent: Moz illa/5.0
00b0  20 28 58 31 31 3b 20 55  3b 20 4c 69 6e 75 78 20    (X11; U ; Linux
00c0  69 36 38 36 3b 20 65 6e  2d 55 53 3b 20 72 76 3a   i686; en -US; rv:
00d0  31 2e 39 2e 32 2e 31 32  29 20 47 65 63 6b 6f 2f   1.9.2.12 ) Gecko/
00e0  32 30 31 30 31 30 32 37  20 4c 69 6e 75 78 20 4d   20101027  Linux M
00f0  69 6e 74 2f 39 20 28 49  73 61 64 6f 72 61 29 20   int/9 (I sadora)
0100  46 69 72 65 66 6f 78 2f  33 2e 36 2e 31 32 0d 0a   Firefox/ 3.6.12..
0110  41 63 63 65 70 74 3a 20  74 65 78 74 2f 68 74 6d   Accept:  text/htm
0120  6c 2c 61 70 70 6c 69 63  61 74 69 6f 6e 2f 78 68   l,applic ation/xh
0130  74 6d 6c 2b 78 6d 6c 2c  61 70 70 6c 69 63 61 74   tml+xml, applicat
0140  69 6f 6e 2f 78 6d 6c 3b  71 3d 30 2e 39 2c 2a 2f   ion/xml; q=0.9,*/
0150  2a 3b 71 3d 30 2e 38 0d  0a 41 63 63 65 70 74 2d   *;q=0.8. .Accept-
0160  4c 61 6e 67 75 61 67 65  3a 20 65 6e 2d 75 73 2c   Language : en-us,
0170  65 6e 3b 71 3d 30 2e 38  2c 6b 6f 3b 71 3d 30 2e   en;q=0.8 ,ko;q=0.
0180  35 2c 7a 68 2d 63 6e 3b  71 3d 30 2e 33 0d 0a 41   5,zh-cn; q=0.3..A
0190  63 63 65 70 74 2d 45 6e  63 6f 64 69 6e 67 3a 20   ccept-En coding:
01a0  67 7a 69 70 2c 64 65 66  6c 61 74 65 0d 0a 41 63   gzip,def late..Ac
01b0  63 65 70 74 2d 43 68 61  72 73 65 74 3a 20 55 54   cept-Cha rset: UT
01c0  46 2d 38 2c 2a 0d 0a 4b  65 65 70 2d 41 6c 69 76   F-8,*..K eep-Aliv
01d0  65 3a 20 31 31 35 0d 0a  43 6f 6e 6e 65 63 74 69   e: 115.. Connecti
01e0  6f 6e 3a 20 6b 65 65 70  2d 61 6c 69 76 65 0d 0a   on: keep -alive..
01f0  43 6f 6f 6b 69 65 3a 20  50 52 45 46 3d 49 44 3d   Cookie:  PREF=ID=
0200  32 38 35 34 63 39 61 66  35 32 38 64 39 62 33 61   2854c9af 528d9b3a
--split, too long---


Thank you very much,

-- 
Arthur Helfstein Fragoso
arthur at life.net.br

