[MPlayer-users] Playing an URL with special characters, charset problem

arthur at life.net.br arthur at life.net.br
Sat Sep 11 09:58:02 CEST 2010


On 09/10/2010 11:56 PM, arthur at life.net.br wrote:
> On 09/10/2010 08:58 AM, Tom Evans wrote:
>> On Fri, Sep 10, 2010 at 12:27 PM,
>> arthur at life.net.br<arthur at life.net.br> wrote:
>>> Hello,
>>>
>>> I wrote a small script that uses Google text-to-speech as a plugin to a
>>> study program called Anki.
>>>
>>> I'm using "subprocess.Popen" in Python to run MPlayer. It works all right
>>> with normal characters. But when I use a special character, it won't.
>>>
>>> So I made some tests, if I run mplayer in the terminal like this:
>>>
>>> ~/mplayer-checkout-2010-09-09 $ ./mplayer
>>> "http://translate.google.com/translate_tts?tl=fr&q=â,ê,î,ô,û"
>>>
>>> It says some weird things that are not right. If I look at the packet it
>>> sends to Google, it shows:
>>> 0x0040: 5aa9 4745 5420 2f74 7261 6e73 6c61 7465 Z.GET./translate
>>> 0x0050: 5f74 7473 3f74 6c3d 6672 2671 3dc3 a22c _tts?tl=fr&q=..,
>>> 0x0060: c3aa 2cc3 ae2c c3b4 2cc3 bb20 4854 5450 ..,..,..,...HTTP
>>>
>>> But if I use Firefox with the same URL, it plays all right and
>>> says those letters (in French):
>>> 0x0040: aaf8 4745 5420 2f74 7261 6e73 6c61 7465 ..GET./translate
>>> 0x0050: 5f74 7473 3f74 6c3d 6672 2671 3d25 4333 _tts?tl=fr&q=%C3
>>> 0x0060: 2541 322c 2543 3325 4141 2c25 4333 2541 %A2,%C3%AA,%C3%A
>>> 0x0070: 452c 2543 3325 4234 2c25 4333 2542 4220 E,%C3%B4,%C3%BB.
>>> 0x0080: 4854 5450 2f31 2e31 0d0a 486f 7374 3a20 HTTP/1.1..Host:.
>>>
>>> So, am I doing something wrong? Is there any parameter that I should set to
>>> read the right charset?
>>>
>>> sorry that I don't know too much about charsets.
>>>
>>> I'm testing on Linux, and I built the latest mplayer version I could find
>>> (checkout-2010-09-09).
>>>
>>> I appreciate any help
>>>
>>> Kind regards,
>>>
>>> Arthur Helfstein Fragoso
>>> arthur at life.net.br
>>
>> URLs don't really have a charset, but any Unicode characters should be
>> percent-encoded. Firefox allows you to type in your local charset, and
>> then does its magic to turn that into the actual URL, whilst still
>> displaying what you typed in.
>>
>> For instance, if you typed in the URL '...?q=â', Firefox requests the
>> URL '...?q=%C3%A2'. This is non-standard, but seemingly supported by
>> all browsers/servers.
>>
>> To do this in python, assuming you have a dictionary of URL arguments
>> called args, with the keys being ascii strings (or unicode strings
>> that can convert to ascii), and the values being any unicode string:
>>
>> from urllib import quote_plus
>> '&'.join([ '%s=%s' % (k, quote_plus(v.encode('utf-8'))) for k,v in
>> args.items() ])
>>
>> Hope that helps
>>
>> Cheers
>>
>> Tom
>
> Tom,
>
> Thank you for the clarification, but I tried it without success.
>
> From the terminal (only the comma, %2C, was passed through correctly):
> mplayer
> "http://translate.google.com/translate_tts?tl=fr&q=%C3%A2%2C%C3%AA%2C%C3%AE%2C%C3%B4%2C%C3%BB"
>
> 0x0050: 5f74 7473 3f74 6c3d 6672 2671 3dc3 a225 _tts?tl=fr&q=..%
> 0x0060: 3243 c3aa 2532 43c3 ae25 3243 c3b4 2532 2C..%2C..%2C..%2
> 0x0070: 43c3 bb20 4854 5450 2f31 2e30 0d0a 486f C...HTTP/1.0..Ho
>
> I thought I'd try escaping the %, so I tried:
> mplayer
> "http://translate.google.com/translate_tts?tl=fr&q=\%C3\%A2,\%C3\%AA,\%C3\%AE,\%C3\%B4,\%C3\%BB"
>
> 0x0050: 5f74 7473 3f74 6c3d 6672 2671 3d25 3543 _tts?tl=fr&q=%5C
> 0x0060: c325 3543 a22c 2535 43c3 2535 43aa 2c25 .%5C.,%5C.%5C.,%
> 0x0070: 3543 c325 3543 ae2c 2535 43c3 2535 43b4 5C.%5C.,%5C.%5C.
> 0x0080: 2c25 3543 c325 3543 bb20 4854 5450 2f31 ,%5C.%5C..HTTP/1
>
> Without the quotes (so I had to escape the &):
> mplayer
> http://translate.google.com/translate_tts?tl=fr\&q=%C3%A2%2C%C3%AA%2C%C3%AE%2C%C3%B4%2C%C3%BB
>
> 0x0050: 5f74 7473 3f74 6c3d 6672 2671 3dc3 a225 _tts?tl=fr&q=..%
> 0x0060: 3243 c3aa 2532 43c3 ae25 3243 c3b4 2532 2C..%2C..%2C..%2
> 0x0070: 43c3 bb20 4854 5450 2f31 2e30 0d0a 486f C...HTTP/1.0..Ho
>
> Escaping the % again:
> mplayer
> http://translate.google.com/translate_tts?tl=fr\&q=\%C3\%A2\%2C\%C3\%AA\%2C\%C3\%AE\%2C\%C3\%B4\%2C\%C3\%BB
>
> 0x0050: 5f74 7473 3f74 6c3d 6672 2671 3dc3 a225 _tts?tl=fr&q=..%
> 0x0060: 3243 c3aa 2532 43c3 ae25 3243 c3b4 2532 2C..%2C..%2C..%2
> 0x0070: 43c3 bb20 4854 5450 2f31 2e30 0d0a 486f C...HTTP/1.0..Ho
>
>
> So I also tried in Python:
> address = ('http://translate.google.com/translate_tts?tl=' + TTS_language
>            + '&q=' + quote_plus(text.encode('utf-8')))
> #http://translate.google.com/translate_tts?tl=fr&q=%C3%A2%2C%C3%AA%2C%C3%AE%2C%C3%B4%2C%C3%BB
>
> 0x0050: 5f74 7473 3f74 6c3d 6672 2671 3dc3 a225 _tts?tl=fr&q=..%
> 0x0060: 3243 c3aa 2532 43c3 ae25 3243 c3b4 2532 2C..%2C..%2C..%2
> 0x0070: 43c3 bb20 4854 5450 2f31 2e30 0d0a 486f C...HTTP/1.0..Ho
>
> I also tried with quotes around the "q" argument:
> address = ('http://translate.google.com/translate_tts?tl=' + TTS_language
>            + '&q="' + quote_plus(text.encode('utf-8')) + '"')
> #http://translate.google.com/translate_tts?tl=fr&q="%C3%A2%2C%C3%AA%2C%C3%AE%2C%C3%B4%2C%C3%BB"
>
> 0x0050: 5f74 7473 3f74 6c3d 6672 2671 3d25 3232 _tts?tl=fr&q=%22
> 0x0060: c3a2 2532 43c3 aa25 3243 c3ae 2532 43c3 ..%2C..%2C..%2C.
> 0x0070: b425 3243 c3bb 2532 3220 4854 5450 2f31 .%2C..%22.HTTP/1
>
> I'm using this line to run mplayer from Python:
> subprocess.Popen(['mplayer', '-slave', address], stdin=PIPE,
> stdout=PIPE, stderr=STDOUT)
>
> And if I open this URL in Firefox 3.6:
> http://translate.google.com/translate_tts?tl=fr&q=%C3%A2%2C%C3%AA%2C%C3%AE%2C%C3%B4%2C%C3%BB
>
> it refreshes the address bar, which then shows:
> http://translate.google.com/translate_tts?tl=fr&q=â%2Cê%2Cî%2Cô%2Cû
> and it plays all right:
> 0x0050: 5f74 7473 3f74 6c3d 6672 2671 3d25 4333 _tts?tl=fr&q=%C3
> 0x0060: 2541 3225 3243 2543 3325 4141 2532 4325 %A2%2C%C3%AA%2C%
> 0x0070: 4333 2541 4525 3243 2543 3325 4234 2532 C3%AE%2C%C3%B4%2
> 0x0080: 4325 4333 2542 4220 4854 5450 2f31 2e31 C%C3%BB.HTTP/1.1
>
> So I don't know how to get mplayer to make the right request. :/
>
> Any ideas?
>
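
A note on the backslash attempts quoted above: inside double quotes the shell keeps a backslash that precedes %, while outside quotes it drops it. That would explain the %5C (an escaped backslash) in the capture for the quoted attempt, and why the unquoted attempt produced the same request as the one with no backslashes at all. A minimal Python 3 sketch, assuming shlex in POSIX mode is a fair stand-in for the shell's quoting rules in this case (the variable names are only illustrative):

```python
import shlex

# Inside double quotes, a backslash before % is kept literally.
quoted_arg = shlex.split(r'"\%C3\%A2"')

# Outside quotes, the backslash escapes the next character and is removed.
unquoted_arg = shlex.split(r'\%C3\%A2')

print(quoted_arg)
print(unquoted_arg)
```

So the backslash never protects the % from mplayer; it only changes what the shell hands over.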

Thinking about it a little more: it's actually Firefox that downloads the
sound file, and mplayer just plays it, isn't it?

So I made another test:
wget "http://translate.google.com/translate_tts?tl=fr&q=â,ê,î,ô,û"
	0x0040:  3e5d 4745 5420 2f74 7261 6e73 6c61 7465  >]GET./translate
	0x0050:  5f74 7473 3f74 6c3d 6672 2671 3d25 4333  _tts?tl=fr&q=%C3
	0x0060:  2541 322c 2543 3325 4141 2c25 4333 2541  %A2,%C3%AA,%C3%A
	0x0070:  452c 2543 3325 4234 2c25 4333 2542 4220  E,%C3%B4,%C3%BB.
	0x0080:  4854 5450 2f31 2e30 0d0a 5573 6572 2d41  HTTP/1.0..User-A
	0x0090:  6765 6e74 3a20 5767 6574 2f31 2e31 3220  gent:.Wget/1.12.

wget 
"http://translate.google.com/translate_tts?tl=fr&q=%C3%A2%2C%C3%AA%2C%C3%AE%2C%C3%B4%2C%C3%BB"
	0x0050:  5f74 7473 3f74 6c3d 6672 2671 3d25 4333  _tts?tl=fr&q=%C3
	0x0060:  2541 3225 3243 2543 3325 4141 2532 4325  %A2%2C%C3%AA%2C%
	0x0070:  4333 2541 4525 3243 2543 3325 4234 2532  C3%AE%2C%C3%B4%2
	0x0080:  4325 4333 2542 4220 4854 5450 2f31 2e30  C%C3%BB.HTTP/1.0
	0x0090:  0d0a 5573 6572 2d41 6765 6e74 3a20 5767  ..User-Agent:.Wg
	0x00a0:  6574 2f31 2e31 3220 286c 696e 7578 2d67  et/1.12.(linux-g
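
The two query strings that wget requested above differ only in whether the commas are escaped. A minimal Python 3 sketch that produces both forms (the code earlier in this thread uses Python 2's urllib.quote_plus; urllib.parse is assumed here as its Python 3 equivalent, and the variable names are only illustrative):

```python
from urllib.parse import quote, quote_plus

text = "â,ê,î,ô,û"

# Commas left literal, matching the first wget capture.
commas_literal = quote(text.encode("utf-8"), safe=",")

# Commas escaped as %2C as well, matching the second wget capture.
commas_escaped = quote_plus(text.encode("utf-8"))

print(commas_literal)
print(commas_escaped)
```

Either form reaches the server as plain ASCII, which is what the wget and Firefox captures show.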

With wget I can't download Google's TTS sound file =/, but it does make
the right request. I also tried "curl -v", which behaves the same as wget
and also made the right request.
So from this I guess that it's mplayer that is not making the right request.
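
One way to see the difference is to compare the bytes from the captures directly. A small Python 3 sketch, with the values copied from the first mplayer and wget dumps in this thread (bytes_from_dump is just an illustrative helper, not part of any tool):

```python
from urllib.parse import unquote_to_bytes


def bytes_from_dump(hex_words: str) -> bytes:
    """Turn the hex column of a tcpdump-style line into raw bytes."""
    return bytes.fromhex(hex_words.replace(" ", ""))


# Bytes after "q=" in the first mplayer capture (hex column).
mplayer_sent = bytes_from_dump("c3a2 2cc3 aa2c c3ae 2cc3 b42c c3bb")

# Bytes after "q=" in the first wget capture (ASCII column).
wget_sent = b"%C3%A2,%C3%AA,%C3%AE,%C3%B4,%C3%BB"

# mplayer's bytes are valid UTF-8 for the original text ...
print(mplayer_sent.decode("utf-8"))

# ... and they equal wget's request with the percent-escapes decoded.
print(unquote_to_bytes(wget_sent) == mplayer_sent)
```

If this reading is right, mplayer sends the same text as wget, but as raw UTF-8 bytes on the wire instead of percent-escapes.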

So, could you check this for me?

Thank you very much,

-- 
Arthur Helfstein Fragoso
arthur at life.net.br

