Use pytesseract OCR to recognize text from an image

北恋 2020-11-30 20:02

I need to use Pytesseract to extract text from this picture:

and the code:

from PIL import Image, ImageEnhance, ImageFilter
import pytesseract
         


        
6 answers
  • 2020-11-30 20:27

    Here is my solution:

    import pytesseract
    from PIL import Image, ImageEnhance, ImageFilter
    
    im = Image.open("temp.jpg") # the second one 
    im = im.filter(ImageFilter.MedianFilter())
    enhancer = ImageEnhance.Contrast(im)
    im = enhancer.enhance(2)
    im = im.convert('1')
    im.save('temp2.jpg')
    text = pytesseract.image_to_string(Image.open('temp2.jpg'))
    print(text)
    
  • 2020-11-30 20:29

    Here is a different pytesseract approach, which restricts recognition to digits:

    import pytesseract
    from PIL import Image
    text = pytesseract.image_to_string(Image.open("temp.jpg"), lang='eng',
                            config='--psm 10 --oem 3 -c tessedit_char_whitelist=0123456789')
    
    print(text)
    
  • 2020-11-30 20:32

    You only need to enlarge the picture with cv2.resize:

    image = cv2.resize(image,(0,0),fx=7,fy=7)
    

    My original 200x40 picture was recognized as: HZUBS

    The same picture resized to 1400x300 was recognized as: A 1234 (which is correct)
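
    As a minimal sketch of the resize step with its import spelled out: the helper names upscaled_size and upscale are hypothetical, and cubic interpolation is an assumed choice that this answer does not specify.

```python
def upscaled_size(width, height, factor):
    """Return the (width, height) expected after uniform scaling by `factor`."""
    return round(width * factor), round(height * factor)


def upscale(image, factor=7):
    """Enlarge an image array by `factor` in both directions."""
    import cv2  # imported lazily so upscaled_size works without OpenCV installed

    # dsize=(0, 0) tells OpenCV to derive the output size from fx and fy.
    # INTER_CUBIC is an assumption here; the answer above uses the default.
    return cv2.resize(image, (0, 0), fx=factor, fy=factor,
                      interpolation=cv2.INTER_CUBIC)
```

    Note that scaling 200x40 uniformly by 7 yields 1400x280, so the 1400x300 figure quoted above is presumably approximate.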

    Then threshold and blur the enlarged image:

    retval, image = cv2.threshold(image,200,255, cv2.THRESH_BINARY)
    image = cv2.GaussianBlur(image,(11,11),0)
    image = cv2.medianBlur(image,9)
    

    Finally, adjust the parameters, such as the page segmentation mode, to improve the results:

    Page segmentation modes:
      0    Orientation and script detection (OSD) only.
      1    Automatic page segmentation with OSD.
      2    Automatic page segmentation, but no OSD, or OCR.
      3    Fully automatic page segmentation, but no OSD. (Default)
      4    Assume a single column of text of variable sizes.
      5    Assume a single uniform block of vertically aligned text.
      6    Assume a single uniform block of text.
      7    Treat the image as a single text line.
      8    Treat the image as a single word.
      9    Treat the image as a single word in a circle.
     10    Treat the image as a single character.
     11    Sparse text. Find as much text as possible in no particular order.
     12    Sparse text with OSD.
     13    Raw line. Treat the image as a single text line,
                bypassing hacks that are Tesseract-specific.
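
    One way to compare these modes is to run the same image through several of them. The sketch below assumes pytesseract and a Tesseract 4 or later install; build_config and ocr_with_modes are hypothetical helper names, and the default list of modes is only an illustrative pick.

```python
def build_config(psm=6, oem=3, whitelist=None):
    """Build a Tesseract config string such as '--psm 6 --oem 3'."""
    parts = [f"--psm {psm}", f"--oem {oem}"]
    if whitelist:
        # Restrict recognition to the given characters.
        parts.append(f"-c tessedit_char_whitelist={whitelist}")
    return " ".join(parts)


def ocr_with_modes(image, modes=(6, 7, 8, 10), **kwargs):
    """Run OCR once per page segmentation mode and return {mode: text}."""
    import pytesseract  # lazy import: needs pytesseract and the Tesseract binary

    return {
        psm: pytesseract.image_to_string(
            image, config=build_config(psm, **kwargs)
        ).strip()
        for psm in modes
    }
```

    For example, ocr_with_modes(image, whitelist="0123456789") would return one recognized string per mode, which makes it easier to see which mode suits a given picture.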
    
  • 2020-11-30 20:35

    Here is a small improvement that removes noise and stray lines whose colour falls within a certain range.

    import pytesseract
    from PIL import Image, ImageEnhance, ImageFilter
    
    im = Image.open(img)  # img is the path of the image 
    im = im.convert("RGBA")
    newimdata = []
    datas = im.getdata()
    
    for item in datas:
        if item[0] < 112 or item[1] < 112 or item[2] < 112:
            newimdata.append(item)
        else:
            newimdata.append((255, 255, 255))
    im.putdata(newimdata)
    
    im = im.filter(ImageFilter.MedianFilter())
    enhancer = ImageEnhance.Contrast(im)
    im = enhancer.enhance(2)
    im = im.convert('1')
    im.save('temp2.jpg')
    # Tesseract 4 and later expect --psm (double dash); Tesseract 3.x used -psm.
    text = pytesseract.image_to_string(Image.open('temp2.jpg'), lang='eng',
                                       config='--psm 6 -c tessedit_char_whitelist=0123456789abcdefghijklmnopqrstuvwxyz')
    print(text)
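
    The colour filter in the loop above can also be written as a small standalone function, which makes the rule easier to test in isolation. This is only a sketch: whiten_light_pixels is a hypothetical name, and the threshold of 112 is taken from the code above.

```python
def whiten_light_pixels(pixels, threshold=112, white=(255, 255, 255, 255)):
    """Keep pixels with any RGB channel below `threshold`; whiten the rest.

    `pixels` is an iterable of RGBA tuples, e.g. the result of im.getdata().
    """
    cleaned = []
    for pixel in pixels:
        r, g, b = pixel[:3]  # ignore the alpha channel when deciding
        if r < threshold or g < threshold or b < threshold:
            cleaned.append(pixel)   # dark enough: likely part of the text
        else:
            cleaned.append(white)   # light pixel: treat as background or noise
    return cleaned
```

    With this helper, the loop reduces to im.putdata(whiten_light_pixels(im.getdata())).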
    
  • 2020-11-30 20:45

    To perform OCR on an image, it's important to preprocess it first. Here's a simple approach using OpenCV and Pytesseract OCR. The idea is to obtain a processed image in which the text to extract is black on a white background. To do this, we convert to grayscale, apply a slight Gaussian blur, then apply Otsu's threshold to obtain a binary image. From here, we apply morphological operations to remove noise. Finally, we invert the image. We perform text extraction with the --psm 6 configuration option, which assumes a single uniform block of text. Other page segmentation modes are available if this one does not fit the image.


    Here's a visualization of each step:

    Input image

    Convert to grayscale -> Gaussian blur -> Otsu's threshold

    Notice the tiny specks of noise. To remove them, we can perform morphological operations

    Finally we invert the image

    Result from Pytesseract OCR

    2HHH
    

    Code

    import cv2
    import pytesseract
    
    pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files\Tesseract-OCR\tesseract.exe"
    
    # Grayscale, Gaussian blur, Otsu's threshold
    image = cv2.imread('1.png')
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    blur = cv2.GaussianBlur(gray, (3,3), 0)
    thresh = cv2.threshold(blur, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]
    
    # Morph open to remove noise and invert image
    kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (3,3))
    opening = cv2.morphologyEx(thresh, cv2.MORPH_OPEN, kernel, iterations=1)
    invert = 255 - opening
    
    # Perform text extraction
    data = pytesseract.image_to_string(invert, lang='eng', config='--psm 6')
    print(data)
    
    cv2.imshow('thresh', thresh)
    cv2.imshow('opening', opening)
    cv2.imshow('invert', invert)
    cv2.waitKey()
    
  • 2020-11-30 20:51

    To extract the text directly from the web, you can try the following implementation (making use of the first image):

    import io
    import requests
    import pytesseract
    from PIL import Image, ImageFilter, ImageEnhance
    
    response = requests.get('https://i.stack.imgur.com/HWLay.gif')
    img = Image.open(io.BytesIO(response.content))
    img = img.convert('L')
    img = img.filter(ImageFilter.MedianFilter())
    enhancer = ImageEnhance.Contrast(img)
    img = enhancer.enhance(2)
    img = img.convert('1')
    img.save('image.jpg')
    imagetext = pytesseract.image_to_string(img)
    print(imagetext)
    