How to install leptonica+tesseract on Windows without Visual Studio to use in Anaconda?

Posted by 独自空忆成欢 on 2019-12-10 09:43:16

Question


I want to perform text recognition on images using Python. I have installed Anaconda. Now I want to install Tesseract, but I also need to install Leptonica. I have not found any clear instructions for doing this on Windows, and I do not want to install Visual Studio for Leptonica. Could anybody provide clear instructions for installing Leptonica and Tesseract on Windows, without Visual Studio, for use in Anaconda? Thanks.


Answer 1:


Here is a simple set of steps to get the Tesseract 3.05 dev version (as of 04/22/2016) working on both Windows 7 and Windows 8 machines:

1- Install Tesseract using the executable installer from the official tesseract-ocr page (version 3.02 for Windows will suffice).

2- Download the following two files for the Tesseract 3.05 dev version from http://domasofan.spdns.eu/tesseract/

There are 2 exe files:

  • tesseract-core-yyyymmdd.exe: the Tesseract core application, without language data.
  • tesseract-langs-yyyymmdd.exe: all the language data available for Tesseract.

(yyyymmdd means year 4 digits, month 2 digits and day 2 digits.)
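
As a small illustration of the naming scheme, the date embedded in an installer filename can be read back in Python. This is only a minimal sketch: the helper name `installer_date` and the example filename are hypothetical, made up here to match the yyyymmdd pattern described above.

```python
import re
from datetime import date, datetime

# Matches names such as tesseract-core-yyyymmdd.exe or tesseract-langs-yyyymmdd.exe
_PATTERN = re.compile(r"^tesseract-(core|langs)-(\d{8})\.exe$")

def installer_date(filename: str) -> date:
    """Return the date encoded in an installer filename (hypothetical helper)."""
    match = _PATTERN.match(filename)
    if match is None:
        raise ValueError(f"unexpected installer name: {filename!r}")
    # Parse the 8 digits as 4-digit year, 2-digit month, 2-digit day.
    return datetime.strptime(match.group(2), "%Y%m%d").date()

# Hypothetical filename, used only to illustrate the pattern.
print(installer_date("tesseract-core-20160422.exe"))  # 2016-04-22
```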

The app is portable so you can install it on a USB stick or in another location.

Sub-steps to install these:

  1. Download the tesseract-core and tesseract-langs packages.
  2. Double-click the tesseract-core package and extract it to a directory of your choice (for example, a temporary new folder called "Tess_temp").
  3. Double-click the tesseract-langs package and extract it to a \tessdata subfolder of that same "Tess_temp" folder. For example, if you extracted tesseract-core to c:\Tess_temp, tesseract-langs needs to go to c:\Tess_temp\tessdata.

  4. Now copy everything in "Tess_temp" to the folder where Tesseract 3.02 was installed in step 1 above (usually C:\Program Files (x86)\Tesseract-OCR), replacing the 3.02 files with the 3.05 ones.

  5. It should now work with version 3.05 on Windows. To check, copy a sample image test.png (containing text) to this Tesseract-OCR folder, open a cmd window, and type the following commands:

    go to tesseract folder: cd C:\Program Files (x86)\Tesseract-OCR

    run tesseract on test.png: tesseract -l eng test.png test_text -psm 6

It will show you:

Tesseract Open Source OCR Engine v3.05.00dev with Leptonica

Congratulations! (Check test_text.txt for the extracted text.)
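
Since the goal is to use Tesseract from Python in Anaconda, the same check can be scripted by calling the executable through `subprocess`. The following is a minimal sketch, not a definitive setup: it assumes the usual install folder mentioned in step 4, reuses the `-psm` flag exactly as in the command above, and the helper names `build_command` and `ocr_image` are hypothetical.

```python
import subprocess
from pathlib import Path

# Assumed install location (the usual one mentioned in step 4); adjust if yours differs.
TESSERACT_DIR = Path(r"C:\Program Files (x86)\Tesseract-OCR")

def build_command(image, output_base, lang="eng", psm=6, exe="tesseract"):
    """Build the same command line as the manual test above (hypothetical helper)."""
    return [str(exe), "-l", lang, str(image), str(output_base), "-psm", str(psm)]

def ocr_image(image, output_base, lang="eng", psm=6):
    """Run Tesseract on an image and return the text written to <output_base>.txt."""
    exe = TESSERACT_DIR / "tesseract.exe"
    cmd = build_command(image, output_base, lang, psm, exe)
    # Run inside the Tesseract folder, as in the manual test; raise if Tesseract fails.
    subprocess.run(cmd, check=True, cwd=TESSERACT_DIR)
    return (TESSERACT_DIR / f"{output_base}.txt").read_text(encoding="utf-8")
```

Calling `ocr_image("test.png", "test_text")` mirrors the manual test: it reads test.png from the Tesseract-OCR folder and returns the contents of test_text.txt. Note that the single-dash `-psm` spelling matches the 3.x command used here; Tesseract 4 and later spell this option `--psm`.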



Source: https://stackoverflow.com/questions/35270473/how-to-install-leptonicatesseract-on-windows-without-visual-studio-to-use-in-an
