How to represent:
Have you seen this?
https://code.google.com/p/tesseract-ocr/issues/detail?id=581
The bug list shows it as "no longer an issue".
baseApi.setVariable("tessedit_char_whitelist", "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz");
Put this line before the code that initializes Tesseract.
You must set the "page segmentation mode" to "single char".
For example, in Android you do the following:
api.setPageSegMode(TessBaseAPI.PageSegMode.PSM_SINGLE_CHAR);
Python code for that configuration looks like this:
import cv2
import pytesseract
img = cv2.imread("path to some image")
# Restrict output to letters and digits, and treat the image as one character.
text = pytesseract.image_to_string(
    img,
    config=(
        "-c tessedit_char_whitelist="
        "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"
        " --psm 10"
        " -l osd"
    ),
)
print(text)
The --psm flag defines the page segmentation mode. According to the Tesseract documentation, 10 means "Treat the image as a single character." So to recognize a single character, you just need to pass the --psm 10 flag.
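If you build this config string in more than one place, a small helper can keep the whitelist and the --psm value together. This is only a sketch: build_single_char_config and recognize_single_char are names I made up for illustration, not part of pytesseract, and the default whitelist (ASCII letters plus digits) is an assumption taken from the example above.

```python
import string


def build_single_char_config(whitelist=string.ascii_letters + string.digits, psm=10):
    """Build a Tesseract config string for single-character recognition."""
    if not whitelist:
        raise ValueError("whitelist must contain at least one character")
    if any(ch.isspace() for ch in whitelist):
        # The config string is split into command-line arguments,
        # so whitespace inside the whitelist would cut the value short.
        raise ValueError("whitelist must not contain whitespace")
    return f"--psm {psm} -c tessedit_char_whitelist={whitelist}"


def recognize_single_char(image, whitelist=string.ascii_letters + string.digits):
    """Run OCR on an image that is expected to contain exactly one character."""
    # Imported lazily so the config helper works without pytesseract installed.
    import pytesseract

    config = build_single_char_config(whitelist)
    return pytesseract.image_to_string(image, config=config).strip()
```

You would call recognize_single_char(img) with the image loaded by cv2.imread, or pass a narrower whitelist such as string.digits if you only expect numbers.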
You need to set Tesseract's page segmentation mode to "single character."