问题
I am starting to learn OpenCV and Tesseract, and have trouble with what seems to be a very simple example.
Here is an image that I am trying to OCR, that reads "171 m":
I do some preprocessing. Since blue is the dominant color of the text, I extract the blue channel and apply simple thresholding.
img = cv2.imread('171_m.png')[y, x, 0]
_, thresh = cv2.threshold(img, 150, 255, cv2.THRESH_BINARY_INV)
The resulting image looks like this:
Then throw that into Tesseract, with psm 7
for single line:
text = pytesseract.image_to_string(thresh, config='--psm 7')
print(text)
>>> lim
I also tried to restrict possible characters, and it gets a bit better, but not quite.
text = pytesseract.image_to_string(thresh, config='--psm 7 -c tessedit_char_whitelist=1234567890m')
print(text)
>>> 17m
OpenCV v4.1.1.
Tesseract v5.0.0-alpha.20190708
Any help appreciated.
回答1:
Before throwing the image into Pytesseract, preprocessing can help. The desired text should be in black while the background should be in white. Here's an approach
- Convert image to grayscale and enlarge image
- Gaussian blur
- Otsu's threshold
- Invert image
After converting to grayscale, we enlarge the image using imutils.resize()
and Gaussian blur. From here we Otsu's threshold to get a binary image
If you have noisy images, an additional step would be to use morphological operations to smooth or remove noise. But since your image is clean enough, we can simply invert the image to get our result
Output from Pytesseract using --psm 6
171m
import cv2
import pytesseract
import imutils
pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files\Tesseract-OCR\tesseract.exe"
image = cv2.imread('1.png',0)
image = imutils.resize(image, width=400)
blur = cv2.GaussianBlur(image, (7,7), 0)
thresh = cv2.threshold(blur, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)[1]
result = 255 - thresh
data = pytesseract.image_to_string(result, lang='eng',config='--psm 6')
print(data)
cv2.imshow('thresh', thresh)
cv2.imshow('result', result)
cv2.waitKey()
回答2:
Disclaimer : This is not a solution, just a trial to partially solve this.
This process works only if you have knowledge of the number of the characters present in the image beforehand. Here is the trial code :
img0 = cv2.imread('171_m.png', 0)
adap_thresh = cv2.adaptiveThreshold(img0, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY, 11, 2)
text_adth = pytesseract.image_to_string(adap_thresh, config='--psm 7')
After adaptive thresholding, the produced image is like this :
Pytesseract gives output as :
171 mi.
Now, if you know, in advance, the number of characters present, you can slice the string read by pytesseract and get the desired output as '171m'
.
回答3:
I thought your image was not sharp enough, hence I applied the process described at How do I increase the contrast of an image in Python OpenCV to first sharpen your image and then proceed by extracting the blue layer and running the tesseract.
I hope this helps.
import cv2
import pytesseract
img = cv2.imread('test.png') #test.png is your original image
s = 128
img = cv2.resize(img, (s,int(s/2)), 0, 0, cv2.INTER_AREA)
def apply_brightness_contrast(input_img, brightness = 0, contrast = 0):
if brightness != 0:
if brightness > 0:
shadow = brightness
highlight = 255
else:
shadow = 0
highlight = 255 + brightness
alpha_b = (highlight - shadow)/255
gamma_b = shadow
buf = cv2.addWeighted(input_img, alpha_b, input_img, 0, gamma_b)
else:
buf = input_img.copy()
if contrast != 0:
f = 131*(contrast + 127)/(127*(131-contrast))
alpha_c = f
gamma_c = 127*(1-f)
buf = cv2.addWeighted(buf, alpha_c, buf, 0, gamma_c)
return buf
out = apply_brightness_contrast(img,0,64)
b, g, r = cv2.split(out) #spliting and using just the blue
pytesseract.image_to_string(255-b, config='--psm 7 -c tessedit_char_whitelist=1234567890m') # the 255-b here because the image has black backgorund and white numbers, 255-b switches the colors
来源:https://stackoverflow.com/questions/58103337/how-to-ocr-image-with-tesseract