I have been reviewing replacements for the Office 2007 MODI OCR (the OCR in OneNote 2010 gives lower-quality results than 2007 :-( ). I noticed that Windows 7 contains an OCR library once you install the optional TIFF IFilter feature.
The OCR component gets installed to
%programfiles%\Common Files\microsoft shared\OCR\7.0\xocr3.psp.dll
but I don't see any API for it?
Does anyone know how to interface with it, preferably in C#?
ANSWER: Found the solution. Once the optional Windows 7 TIFF IFilter feature is installed, I can get the text output of a screenshot using the code/exe at http://www.codeproject.com/KB/cs/IFilter.aspx. Also, if you add the same [HKEY_CLASSES_ROOT\.tiff\PersistentHandler] value for .png and .jpg, OCR also works for JPG and PNG files.
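The registry step can be sketched roughly as follows. This is only an illustration, not code from the CodeProject article: it is written in Python for brevity, `reg_file_for` and `read_tiff_handler` are hypothetical helper names, and it assumes the PersistentHandler is stored as the default value of the `.tiff\PersistentHandler` key. It reads whatever value your own machine has (no CLSID is hard-coded) and writes a `.reg` file that you can inspect before importing it from an elevated prompt.

```python
import sys


def reg_file_for(handler_clsid, extensions=(".png", ".jpg")):
    """Build .reg text that gives each extension the same PersistentHandler.

    handler_clsid is whatever value the machine stores under
    HKEY_CLASSES_ROOT\\.tiff\\PersistentHandler (read it, do not guess it).
    """
    lines = ["Windows Registry Editor Version 5.00", ""]
    for ext in extensions:
        lines.append("[HKEY_CLASSES_ROOT\\" + ext + "\\PersistentHandler]")
        lines.append('@="' + handler_clsid + '"')  # "@" is the key's default value
        lines.append("")
    return "\r\n".join(lines)


def read_tiff_handler():
    """Read the default value of .tiff\\PersistentHandler (Windows only)."""
    import winreg  # imported here so the module still loads on other systems

    with winreg.OpenKey(winreg.HKEY_CLASSES_ROOT, r".tiff\PersistentHandler") as key:
        value, _value_type = winreg.QueryValueEx(key, "")
    return value


if __name__ == "__main__" and sys.platform == "win32":
    clsid = read_tiff_handler()
    # newline="" keeps the explicit \r\n line endings from being translated again
    with open("ocr_handlers.reg", "w", newline="") as fh:
        fh.write(reg_file_for(clsid))
```

Generating a file instead of writing to the registry directly means you can review the change first, and it is easy to undo by deleting the two keys again.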
- Tessnet OCR is a good solution, but pretty old (last release from 2009). There are a couple of very good free OCR solutions available for .NET:
- Asprise C# OCR SDK. Very good and fast.
- Microsoft Research Project Hawaii. Web-based (cloud) OCR solution with full docs and samples (discontinued 2013).
- Bing OCR. Web-based (cloud) OCR replacement for the above (discontinued March 2014).
Try TessNet, using the suggestions I made to the poster in this post (enlarge the image, use a separate process):
c# OCR can't recognize digits (tesseract 2)
I was exploring the Windows 7 DLLs and found three libraries that might be useful: thocr.psp.dll, xocr3.psp.dll, and ximage3b.dll. On this website and other similar websites I found out that ximage3b is a Windows system OCR engine. I have been looking for documentation online without success, but hey! At least I know that it's there. I will give you guys an update if I find out how to use it with C#/C/C++.
Source: https://stackoverflow.com/questions/6100404/windows-7-ocr-api