[MPlayer-users] subtitle missing

Nicolas George nicolas.george at normalesup.org
Wed Feb 18 22:36:36 CET 2009


On Décadi 30 Pluviôse, year CCXVII, Mike Castle wrote:
> There are a number of solutions that will extract the images, run them
> through OCR software, and then you manually proofread them for
> accuracy.  I have no idea how well they work for non-English though (I
> imagine they use spell correction to help in the OCR phase).

I have written one of these solutions, and it uses an OCR tool specialized
for subtitles, where glyphs are pixel-exact. It can be tricky to use,
especially with colons and semicolons and when correcting a mistake,
but under good circumstances it can achieve a perfect extraction of the text
in a short time. It works well with non-ASCII languages, and has another
rare feature: it can recognize and keep italics.
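
To give a rough idea of what "pixel-exact" matching means, here is a minimal
Python sketch. It is only an illustration under my own assumptions, not the
code from the repository below: the names split_glyphs, recognize, KNOWN and
LINE are hypothetical, and a real tool needs a way to teach it new glyphs.
The idea is that a rendered glyph always has the same bitmap, so an exact
dictionary lookup is enough and no fuzzy matching is needed.

```python
def split_glyphs(rows):
    """Split a bitmap (equal-length strings, '#' = ink) at blank columns."""
    width = len(rows[0]) if rows else 0
    glyphs, start = [], None
    # Walk one column past the end so the last glyph is flushed too.
    for x in range(width + 1):
        inked = x < width and any(row[x] == "#" for row in rows)
        if inked and start is None:
            start = x  # first inked column of a new glyph
        elif not inked and start is not None:
            glyphs.append(tuple(row[start:x] for row in rows))
            start = None
    return glyphs


def recognize(rows, known, unknown="?"):
    """Look up each glyph bitmap exactly; unmatched glyphs become `unknown`."""
    return "".join(known.get(glyph, unknown) for glyph in split_glyphs(rows))


# Hypothetical glyph table: exact bitmap -> character.
KNOWN = {
    ("#.#", "###", "#.#"): "H",
    ("#", "#", "#"): "I",
}

# Hypothetical subtitle line, three pixels high.
LINE = [
    "#.#.#",
    "###.#",
    "#.#.#",
]

if __name__ == "__main__":
    print(recognize(LINE, KNOWN))
```

Any glyph missing from the table comes out as "?" here, which is the point
where a human would have to tell the tool which character it is.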

You may want to give it a try. The source code is here:
http://gitorious.org/projects/exocr/repos/mainline

Regards,

-- 
  Nicolas George