Use pytesseract to recognize text from an image

自闭症网瘾萝莉.ら submitted on 2019-11-28 03:59:55
Smith John

Here is my solution:

import pytesseract
from PIL import Image, ImageEnhance, ImageFilter

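im = Image.open("temp.jpg")  # the second one
im = im.filter(ImageFilter.MedianFilter())  # smooth out speckle noise
enhancer = ImageEnhance.Contrast(im)
im = enhancer.enhance(2)  # double the contrast
im = im.convert('1')  # convert to 1-bit black and white
im.save('temp2.jpg')
text = pytesseract.image_to_string(Image.open('temp2.jpg'))
print(text)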

To extract the text directly from the web, you can try the following implementation (making use of the first image):

import io
import requests
import pytesseract
from PIL import Image, ImageFilter, ImageEnhance

response = requests.get('')
img = Image.open(io.BytesIO(response.content))  # decode the downloaded bytes
img = img.convert('L')
img = img.filter(ImageFilter.MedianFilter())
enhancer = ImageEnhance.Contrast(img)
img = enhancer.enhance(2)
img = img.convert('1')
img.save('image.jpg')
imagetext = pytesseract.image_to_string(img)
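
The download step can be tried without a network connection. Here is a minimal sketch, assuming Pillow is installed: it encodes a small synthetic image to PNG bytes (standing in for `response.content`; the data and the helper name `image_from_bytes` are made up for illustration) and reopens it through `io.BytesIO`, the same way downloaded bytes would be opened.

```python
import io
from PIL import Image

def image_from_bytes(data):
    """Open an image from raw encoded bytes, e.g. the body of an HTTP response."""
    return Image.open(io.BytesIO(data))

# Build PNG bytes in memory to stand in for a downloaded file (hypothetical data).
buffer = io.BytesIO()
Image.new("RGB", (200, 40), "white").save(buffer, format="PNG")
fake_download = buffer.getvalue()

img = image_from_bytes(fake_download)
gray = img.convert("L")  # same grayscale conversion as in the snippet above
print(img.size, gray.mode)
```

With a real request, `fake_download` would simply be replaced by `response.content`.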

Here is a small improvement of mine that removes noise and stray lines whose colour falls within a certain range.

import pytesseract
from PIL import Image, ImageEnhance, ImageFilter

im = Image.open(img)  # img is the path of the image
im = im.convert("RGBA")
newimdata = []
datas = im.getdata()

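for item in datas:
    # Assumption: dark pixels are the text, so keep them and whiten the rest.
    if item[0] < 112 or item[1] < 112 or item[2] < 112:
        newimdata.append(item)
    else:
        newimdata.append((255, 255, 255, 255))
im.putdata(newimdata)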

im = im.filter(ImageFilter.MedianFilter())
enhancer = ImageEnhance.Contrast(im)
im = enhancer.enhance(2)
im = im.convert('1')
im.save('temp2.jpg')
# '-psm' is the Tesseract 3 spelling; Tesseract 4 and later expect '--psm'.
text = pytesseract.image_to_string(Image.open('temp2.jpg'), config='-c tessedit_char_whitelist=0123456789abcdefghijklmnopqrstuvwxyz -psm 6', lang='eng')
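
The colour filter is easier to reason about on a plain list of pixels. Here is a sketch in pure Python, assuming the intent is to keep dark pixels (the text) and turn everything lighter into white background. The function name `keep_dark_pixels` and the sample pixels are hypothetical; the threshold of 112 is the one used in the answer.

```python
def keep_dark_pixels(pixels, threshold=112, background=(255, 255, 255)):
    """Return a new pixel list in which only 'dark' pixels survive.

    A pixel counts as dark when any of its R, G, B channels is below
    the threshold; all other pixels are replaced by the background colour.
    An alpha channel, if present, is carried over unchanged.
    """
    cleaned = []
    for pixel in pixels:
        r, g, b = pixel[:3]
        if r < threshold or g < threshold or b < threshold:
            cleaned.append(pixel)
        else:
            cleaned.append(background + tuple(pixel[3:]))
    return cleaned

sample = [(0, 0, 0), (200, 200, 200), (100, 250, 250)]
print(keep_dark_pixels(sample))
```

With Pillow, the result would be written back with `im.putdata(...)` after reading the pixels with `im.getdata()`.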

I have a different pytesseract approach to share with the community. Here it is:

import pytesseract
from PIL import Image
text = pytesseract.image_to_string(Image.open("temp.jpg"), lang='eng',
                                   config='--psm 10 --oem 3 -c tessedit_char_whitelist=0123456789')


You only need to enlarge the picture with cv2.resize:

image = cv2.resize(image,(0,0),fx=7,fy=7)

My picture at 200x40 -> HZUBS

The same picture resized to 1400x300 -> A 1234 (so this is correct)

And then:

retval, image = cv2.threshold(image,200,255, cv2.THRESH_BINARY)
image = cv2.GaussianBlur(image,(11,11),0)
image = cv2.medianBlur(image,9)
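
To make the thresholding step concrete, here is a pure-Python sketch of the rule that `cv2.THRESH_BINARY` applies to each pixel: values strictly above the threshold become the maximum value, everything else becomes 0. The helper name `threshold_binary` is hypothetical, and it works on a plain list of grayscale values rather than on an image array.

```python
def threshold_binary(values, thresh=200, maxval=255):
    """Apply the THRESH_BINARY rule to a list of grayscale values."""
    return [maxval if value > thresh else 0 for value in values]

row = [12, 199, 200, 201, 255]
print(threshold_binary(row))
```

A value equal to the threshold (200 here) maps to 0, so lowering `thresh` keeps more light pixels as white.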

Then adjust the parameters to improve the results.

Page segmentation modes:
  0    Orientation and script detection (OSD) only.
  1    Automatic page segmentation with OSD.
  2    Automatic page segmentation, but no OSD, or OCR.
  3    Fully automatic page segmentation, but no OSD. (Default)
  4    Assume a single column of text of variable sizes.
  5    Assume a single uniform block of vertically aligned text.
  6    Assume a single uniform block of text.
  7    Treat the image as a single text line.
  8    Treat the image as a single word.
  9    Treat the image as a single word in a circle.
 10    Treat the image as a single character.
 11    Sparse text. Find as much text as possible in no particular order.
 12    Sparse text with OSD.
 13    Raw line. Treat the image as a single text line,
            bypassing hacks that are Tesseract-specific.
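
Trying several of these modes is easier when the option string is built in one place. Here is a small sketch, assuming the `--psm` / `--oem` spelling used by Tesseract 4 and later; the helper name `build_tesseract_config` is hypothetical and not part of pytesseract.

```python
def build_tesseract_config(psm, oem=None, whitelist=None):
    """Build a Tesseract option string from a page segmentation mode,
    an optional OCR engine mode and an optional character whitelist."""
    if not 0 <= psm <= 13:
        raise ValueError("psm must be between 0 and 13")
    parts = ["--psm {}".format(psm)]
    if oem is not None:
        parts.append("--oem {}".format(oem))
    if whitelist:
        parts.append("-c tessedit_char_whitelist={}".format(whitelist))
    return " ".join(parts)

config = build_tesseract_config(10, oem=3, whitelist="0123456789")
print(config)  # --psm 10 --oem 3 -c tessedit_char_whitelist=0123456789
```

The resulting string can be passed as the `config` argument of `pytesseract.image_to_string`.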