How to OCR image with Tesseract

Question

I am starting to learn OpenCV and Tesseract, and have trouble with what seems to be a very simple example.

Here is the image I am trying to OCR; it reads "171 m":

I do some preprocessing. Since blue is the dominant color of the text, I extract the blue channel and apply simple thresholding.

img = cv2.imread('171_m.png')[:, :, 0]  # channel 0 is blue, since OpenCV loads images as BGR
_, thresh = cv2.threshold(img, 150, 255, cv2.THRESH_BINARY_INV)

The resulting image looks like this:

Then I pass that to Tesseract, with --psm 7 to treat the image as a single line of text:

text = pytesseract.image_to_string(thresh, config='--psm 7')
print(text)
>>> lim

I also tried restricting the allowed characters. The result gets a bit better, but is still not right:

text = pytesseract.image_to_string(thresh, config='--psm 7 -c tessedit_char_whitelist=1234567890m')
print(text)
>>> 17m

OpenCV v4.1.1.
Tesseract v5.0.0-alpha.20190708

Any help appreciated.

Answer 1:

Preprocessing the image before passing it to Pytesseract can help. The desired text should be black and the background white. Here's an approach:

Convert image to grayscale and enlarge image
Gaussian blur
Otsu's threshold
Invert image

After loading the image as grayscale, we enlarge it with imutils.resize() and apply a Gaussian blur. From there, we apply Otsu's threshold to get a binary image.

If you have noisy images, an additional step would be to use morphological operations to smooth or remove noise. But since your image is clean enough, we can simply invert the image to get our result.

Output from Pytesseract using --psm 6:

171m

import cv2
import pytesseract
import imutils

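# Windows only: point pytesseract at the Tesseract executable (adjust or remove this line on other setups)
pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files\Tesseract-OCR\tesseract.exe"

# Load as grayscale, enlarge, blur, then apply Otsu's threshold
image = cv2.imread('1.png', 0)
image = imutils.resize(image, width=400)
blur = cv2.GaussianBlur(image, (7, 7), 0)
thresh = cv2.threshold(blur, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)[1]
# Invert so the text is black on a white background
result = 255 - thresh

data = pytesseract.image_to_string(result, lang='eng', config='--psm 6')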
print(data)

cv2.imshow('thresh', thresh)
cv2.imshow('result', result)
cv2.waitKey()

Answer 2:

Disclaimer: this is not a full solution, just an attempt to partially solve the problem.

This approach works only if you know the number of characters in the image beforehand. Here is the trial code:

import cv2
import pytesseract

img0 = cv2.imread('171_m.png', 0)  # load as grayscale
adap_thresh = cv2.adaptiveThreshold(img0, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY, 11, 2)
text_adth = pytesseract.image_to_string(adap_thresh, config='--psm 7')

After adaptive thresholding, the resulting image looks like this:

Pytesseract gives this output:

171 mi.

Now, if you know in advance how many characters are present, you can strip the whitespace from the string returned by Pytesseract and slice it to that length to get the desired output, '171m'.
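
The answer does not show the slicing itself. A minimal sketch of one way to do it, assuming the expected text has 4 characters and that any whitespace in the OCR output can be dropped; the helper name trim_to_known_length is hypothetical:

```python
def trim_to_known_length(raw_text, expected_chars):
    """Drop all whitespace, then keep only the first expected_chars characters."""
    compact = "".join(raw_text.split())
    return compact[:expected_chars]

# Raw string reported in this answer, with a known length of 4 characters
print(trim_to_known_length("171 mi.", 4))  # prints 171m
```

This only trims trailing junk. It cannot repair a misread character in the middle of the string, which is why the answer calls it a partial solution.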

Answer 3:

I thought your image was not sharp enough, so I first applied the process described in "How do I increase the contrast of an image in Python OpenCV" to boost its contrast, then extracted the blue channel and ran Tesseract on the result.

I hope this helps.

import cv2
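import pytesseract

img = cv2.imread('test.png')  # test.png is your original image
s = 128
# Resize to s x s/2 pixels; interpolation is passed by keyword because the positional slots after dsize are dst, fx, fy
img = cv2.resize(img, (s, s // 2), interpolation=cv2.INTER_AREA)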

def apply_brightness_contrast(input_img, brightness = 0, contrast = 0):

    if brightness != 0:
        if brightness > 0:
            shadow = brightness
            highlight = 255
        else:
            shadow = 0
            highlight = 255 + brightness
        alpha_b = (highlight - shadow)/255
        gamma_b = shadow

        buf = cv2.addWeighted(input_img, alpha_b, input_img, 0, gamma_b)
    else:
        buf = input_img.copy()

    if contrast != 0:
        f = 131*(contrast + 127)/(127*(131-contrast))
        alpha_c = f
        gamma_c = 127*(1-f)

        buf = cv2.addWeighted(buf, alpha_c, buf, 0, gamma_c)

    return buf

out = apply_brightness_contrast(img,0,64)

b, g, r = cv2.split(out)  # split the channels and use just the blue one

# 255 - b inverts the image: it has a black background and white numbers, so this switches the colors
text = pytesseract.image_to_string(255 - b, config='--psm 7 -c tessedit_char_whitelist=1234567890m')
print(text)
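
To get a feel for what apply_brightness_contrast(img, 0, 64) does, the contrast branch can be worked through by hand with the same formula. This is only an illustrative sketch: the helper adjust_pixel is hypothetical and approximates, for a single 8-bit value, what cv2.addWeighted(buf, alpha_c, buf, 0, gamma_c) computes, including clipping to the 0..255 range.

```python
contrast = 64

# Same formula as in apply_brightness_contrast
f = 131 * (contrast + 127) / (127 * (131 - contrast))
alpha_c = f                # gain, roughly 2.94 for contrast=64
gamma_c = 127 * (1 - f)    # offset, roughly -246.4 for contrast=64

def adjust_pixel(value):
    """Approximate the per-pixel result: scale, shift, then clip to 0..255."""
    return int(min(255, max(0, round(value * alpha_c + gamma_c))))

print(alpha_c, gamma_c)
print(adjust_pixel(40), adjust_pixel(127), adjust_pixel(200))
```

Because the offset is 127 * (1 - f), a mid-gray value of 127 maps to itself, while darker values are pushed towards 0 and brighter ones towards 255. That is what stretches the contrast around the middle of the range.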

Source: https://stackoverflow.com/questions/58103337/how-to-ocr-image-with-tesseract

Tags: python, image, OpenCV, ocr, tesseract