AW: Tesseract 3.04 - Cygwin64 - Windows 8.1 - Can't open makebox
Schmitz, Marco
marco.schmitz@adesso-mobile.de
Tue Sep 22 12:19:00 GMT 2015
Okay, my shell script problem "not finding makebox" was a line ending problem (CR+LF).
But what about TESSDATA_PREFIX?
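
For anyone hitting the same thing, here is a minimal sketch of how CR+LF line endings in a script can be detected and removed. The temporary files are hypothetical stand-ins for box.sh, and `tr` is just one way to strip the carriage returns (a tool such as dos2unix, if installed, does the same job).

```shell
#!/usr/bin/env bash
# Hypothetical demo file standing in for a script saved with CR+LF endings.
demo="$(mktemp)"
printf '#!/usr/bin/env bash\r\necho makebox\r\n' > "$demo"

# Count the lines that end with a carriage return.
crlf_before=$(grep -c $'\r$' "$demo")

# Write a copy with every carriage return removed.
clean="$(mktemp)"
tr -d '\r' < "$demo" > "$clean"

# Count again on the cleaned copy (grep prints 0 when nothing matches).
crlf_after=$(grep -c $'\r' "$clean")

echo "CR lines before: $crlf_before, after: $crlf_after"
```

A non-zero count before the conversion indicates that the file was saved with Windows line endings.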
-----Original Message-----
From: cygwin-owner@cygwin.com [mailto:cygwin-owner@cygwin.com] On Behalf Of Schmitz, Marco
Sent: Tuesday, 22 September 2015 13:23
To: Marco Atzeri <marco.atzeri@gmail.com>; cygwin@cygwin.com
Subject: AW: Tesseract 3.04 - Cygwin64 - Windows 8.1 - Can't open makebox
Hi Marco,
Without setting TESSDATA_PREFIX (neither in the Windows environment variables nor in .bash_profile), I get:
$ tesseract --list-langs
Error opening data file C:\DEV\tesseract\Tesseract-OCR\tessdata/eng.traineddata
Please make sure the TESSDATA_PREFIX environment variable is set to the parent directory of your "tessdata" directory.
Failed loading language 'eng'
Tesseract couldn't load any languages!
Could not initialize tesseract.
This was my first problem, which I solved by defining TESSDATA_PREFIX (in the Windows environment). Now I get:
$ tesseract --list-langs
List of available languages (13):
arbeitsunfaehigkeit
deu
deu_frak
eng
fra
ita
ita_old
nld
osd
por
spa
spa_old
vie
Then I try this:
$ tesseract arbeitsunfaehigkeit.hausarzt.exp0.jpg arbeitsunfaehigkeit batch.nochop makebox
Tesseract Open Source OCR Engine v3.04.00 with Leptonica
That works, but I originally raised this issue because I was calling the command from a shell script. This is my box.sh:
#!/usr/bin/env bash
tesseract arbeitsunfaehigkeit.hausarzt.exp0.jpg arbeitsunfaehigkeit batch.nochop makebox
and calling it brings up the original error:
$ ./box.sh
Tesseract Open Source OCR Engine v3.04.00 with Leptonica
read_params_file: Can't open makebox
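
One way to see why the same command can behave differently inside a script is to print the arguments exactly as bash parses them. The sketch below assumes the script was saved with CR+LF line endings (the cause identified in the follow-up at the top of this thread); the temporary file is a hypothetical stand-in for box.sh. Bash does not treat a carriage return as whitespace, so it stays attached to the last word on the line.

```shell
#!/usr/bin/env bash
# Hypothetical one-line script, written with a CR+LF line ending.
# It prints its last word in shell-quoted form via printf %q.
script="$(mktemp)"
printf 'printf "%%q\\n" makebox\r\n' > "$script"

# Run it and capture what bash actually saw as the argument.
last_arg="$(bash "$script")"
echo "$last_arg"
```

The quoted form makes the trailing `\r` visible, which would explain why a config file named plain `makebox` is not found.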
Best regards,
Marco
-----Original Message-----
From: cygwin-owner@cygwin.com [mailto:cygwin-owner@cygwin.com] On Behalf Of Marco Atzeri
Sent: Monday, 21 September 2015 16:15
To: cygwin@cygwin.com
Subject: Re: Tesseract 3.04 - Cygwin64 - Windows 8.1 - Can't open makebox
On 21/09/2015 11:03, Schmitz, Marco wrote:
> I am using Windows 8.1 and Cygwin64 in order to run Tesseract 3.04.
>
> Running the following command:
>
> tesseract arbeitsunfaehigkeit.hausarzt.exp0.jpg arbeitsunfaehigkeit batch.nochop makebox
>
> results in the following output:
>
> Tesseract Open Source OCR Engine v3.04.00 with Leptonica
> read_params_file: Can't open makebox
>
> And this is after I fixed the output:
>
> Tesseract Open Source OCR Engine v3.04.00 with Leptonica
> Error opening data file C:\DEV\tesseract\Tesseract-OCR\tessdata/eng.traineddata
Are you defining TESSDATA_PREFIX? Why?
> Please make sure the TESSDATA_PREFIX environment variable is set to the parent directory of your "tessdata" directory.
> Failed loading language 'eng'
> Tesseract couldn't load any languages!
> Could not initialize tesseract.
>
> Using the following line in .bash_profile:
>
> export TESSDATA_PREFIX="/cygdrive/c/DEV/cygwin64/usr/share/tessdata/"
The default should be
TESSDATA_PREFIX="/usr/share/tessdata/"
Without defining TESSDATA_PREFIX, I have
$ tesseract.exe --list-langs
List of available languages (4):
deu
deu_frak
eng
osd
and the language files are in:
$ ls /usr/share/tessdata/
configs/              eng.cube.fold    eng.cube.size          osd.traineddata
deu.traineddata       eng.cube.lm      eng.cube.word-freq     pdf.ttf
deu_frak.traineddata  eng.cube.nn      eng.tesseract_cube.nn  tessconfigs/
eng.cube.bigrams      eng.cube.params  eng.traineddata        training/
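
To check which directory a given TESSDATA_PREFIX value points at, something like the following sketch may help. `locate_traineddata` is a hypothetical helper, not part of Tesseract. Because the error message above asks for the parent of the "tessdata" directory while the default quoted here is the tessdata directory itself, the helper simply probes both layouts and reports where the file exists.

```shell
#!/usr/bin/env bash
# locate_traineddata PREFIX LANG
# Hypothetical helper: prints the path of LANG.traineddata if it exists
# under PREFIX/tessdata/ or directly under PREFIX/, else returns 1.
locate_traineddata() {
    local prefix="${1%/}" lang="$2"   # drop one trailing slash from PREFIX
    if [ -f "$prefix/tessdata/$lang.traineddata" ]; then
        echo "$prefix/tessdata/$lang.traineddata"
    elif [ -f "$prefix/$lang.traineddata" ]; then
        echo "$prefix/$lang.traineddata"
    else
        echo "not found" >&2
        return 1
    fi
}

# Show the value this shell inherited (empty if unset), then probe it,
# falling back to the default directory mentioned in this message.
echo "TESSDATA_PREFIX='${TESSDATA_PREFIX-}'"
locate_traineddata "${TESSDATA_PREFIX:-/usr/share/tessdata}" eng || true
```

If the variable shows a Windows-style path inside a Cygwin shell, it is probably being inherited from the Windows environment rather than set in .bash_profile.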
Regards
Marco
--
Problem reports: http://cygwin.com/problems.html
FAQ: http://cygwin.com/faq/
Documentation: http://cygwin.com/docs.html
Unsubscribe info: http://cygwin.com/ml/#unsubscribe-simple