I have a C++ project in Visual Studio 2010 and wish to use OCR. I came across many "tutorials" for Tesseract, but sadly all I got was a headache and wasted time.
OK, I figured it out, but it works only for the Release | Win32 configuration (no Debug or x64). The Debug configuration produces many linker errors.
So,
1. First of all, download the prepared library folder (Tesseract + Leptonica) here:
Mirror 1 (Google Drive)
Mirror 2 (MediaFire)
2. Extract tesseract.zip to C:\
3. In Visual Studio, open the project properties and go to C/C++ > General > Additional Include Directories
Insert C:\tesseract\include
4. Under Linker > General > Additional Library Directories
Insert C:\tesseract\lib
5. Under Linker > Input > Additional Dependencies
Add:
liblept168.lib
libtesseract302.lib
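As a side note, step 5 and the Release | Win32 restriction can also be written into the source itself. This is only a sketch: the header name is made up for illustration, and it assumes the library names listed above. `_MSC_VER`, `_DEBUG`, `_WIN64` and `#pragma comment(lib, ...)` are standard MSVC features; everything is wrapped in `_MSC_VER` so other compilers ignore it.

```cpp
// ocr_link.h -- hypothetical header name, used here only for illustration.
#if defined(_MSC_VER)
  // MSVC defines _DEBUG for debug-runtime builds and _WIN64 for x64 targets.
  #if defined(_DEBUG) || defined(_WIN64)
    #error "The prebuilt Tesseract/Leptonica libraries link only in Release | Win32."
  #endif
  // Has the same effect as listing the libraries under
  // Linker > Input > Additional Dependencies.
  #pragma comment(lib, "liblept168.lib")
  #pragma comment(lib, "libtesseract302.lib")
#endif
```

Step 4 (Additional Library Directories) is still needed so the linker can find the two .lib files. The benefit is that an unsupported configuration stops with one readable message instead of a wall of linker errors.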
Sample code should look like this:
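#include <tesseract/baseapi.h>
#include <leptonica/allheaders.h>
#include <cstdlib>
#include <iostream>
#include <string>
using namespace std;

int main(void){
    tesseract::TessBaseAPI api;
    // Init returns 0 on success; it fails if the "eng" language data cannot be found.
    if (api.Init("", "eng", tesseract::OEM_DEFAULT) != 0) {
        cerr << "Could not initialize Tesseract." << endl;
        return 1;
    }
    // PSM_SINGLE_LINE has the value 7: treat the image as a single line of text.
    api.SetPageSegMode(tesseract::PSM_SINGLE_LINE);
    api.SetOutputName("out");

    cout << "File name:";
    string image;   // std::string avoids overflowing a fixed-size char buffer
    cin >> image;

    // ProcessPages loads the image file itself, so a separate pixRead call
    // (which also leaked the PIX) is not needed.
    STRING text_out;
    if (!api.ProcessPages(image.c_str(), NULL, 0, &text_out)) {
        cerr << "OCR failed for " << image << endl;
        return 1;
    }
    cout << text_out.string();
    system("pause");
    return 0;
}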
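One thing to watch out for in the sample: `cin >>` stops at the first space, so a path such as C:\My Scans\page 1.png gets cut off. A possible workaround is to read the whole line instead. The helper below is only a sketch; `readImagePath` is a name I made up, it is not part of Tesseract or Leptonica, and it uses nothing but the standard library.

```cpp
#include <iostream>
#include <sstream>
#include <string>

// Hypothetical helper (not a Tesseract/Leptonica function): reads one full
// line from the stream and strips surrounding whitespace and double quotes,
// so paths containing spaces or pasted with quotes survive intact.
std::string readImagePath(std::istream& in) {
    std::string line;
    std::getline(in, line);

    const std::string strip = " \t\r\n\"";
    const std::string::size_type first = line.find_first_not_of(strip);
    if (first == std::string::npos) {
        return std::string();   // the line was empty or only whitespace/quotes
    }
    const std::string::size_type last = line.find_last_not_of(strip);
    return line.substr(first, last - first + 1);
}
```

In the sample it would replace the `cin >>` line, e.g. `std::string image = readImagePath(std::cin);`, and then `image.c_str()` is what gets passed to `ProcessPages`.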
For interaction with OpenCV and Mat type images look HERE
It has been a long time since the last reply, but this may help others.
(This answer should have been a comment on Bruce's answer. Sorry for the confusion.)
You need to use the library through its API.
Most probably:
Start by downloading the libs ( https://code.google.com/p/tesseract-ocr/downloads/detail?name=tesseract-3.02.02-win32-lib-include-dirs.zip&can=2&q= ). They were compiled with Visual Studio 2008, but that should be good enough.
Use the API directly (for an example, look at an open source project that uses it: https://code.google.com/p/qtesseract/source/browse/#svn%2Ftrunk%2Ftessdata ) and read the links from this answer: How can i use tesseract ocr(or any other free ocr) in small c++ project?