Question

I am trying to use Tesseract OCR to print the text from an image, but I am getting the error below. I installed Tesseract OCR from https://github.com/UB-Mannheim/tesseract/wiki and installed pytesseract from the Anaconda prompt using pip install pytesseract, but it's not working. Please help if anyone has faced a similar issue.

(base) C:\Users\500066016>pip install pytesseract
Collecting pytesseract
  Downloading https://files.pythonhosted.org/packages/13/56/befaafbabb36c03e4fdbb3fea854e0aea294039308a93daf6876bf7a8d6b/pytesseract-0.2.4.tar.gz (169kB)
    100% |████████████████████████████████| 174kB 288kB/s
Requirement already satisfied: Pillow in c:\users\500066016\appdata\local\continuum\anaconda3\lib\site-packages (from pytesseract) (5.1.0)
Building wheels for collected packages: pytesseract
  Running setup.py bdist_wheel for pytesseract ... done
  Stored in directory: C:\Users\500066016\AppData\Local\pip\Cache\wheels\a8\0c\00\32e4957a46128bea34fda60b8b01a8755986415cbab3ed8e38
Successfully built pytesseract

Below is the code:

import pytesseract
import cv2
import numpy as np

def get_string(img_path):
    img = cv2.imread(img_path)
    img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    kernel = np.ones((1,1), np.uint8)
    dilate = cv2.dilate(img, kernel, iterations=1)
    erosion = cv2.erode(img, kernel, iterations=1)

    cv2.imwrite('removed_noise.jpg', img)
    img = cv2.adaptiveThreshold(img, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY, 11, 2)
    cv2.imwrite('thresh.jpg', img)
    res = pytesseract.image_to_string('thesh.jpg')
    return res
print('Getting string from the image')
print(get_string('quotes.jpg'))

Below is the error:

Traceback (most recent call last):

  File "", line 1, in <module>
    runfile('C:/Users/500066016/.spyder-py3/project1.py', wdir='C:/Users/500066016/.spyder-py3')

  File "C:\Users\500066016\AppData\Local\Continuum\anaconda3\lib\site-packages\spyder\utils\site\sitecustomize.py", line 705, in runfile
    execfile(filename, namespace)

  File "C:\Users\500066016\AppData\Local\Continuum\anaconda3\lib\site-packages\spyder\utils\site\sitecustomize.py", line 102, in execfile
    exec(compile(f.read(), filename, 'exec'), namespace)

  File "C:/Users/500066016/.spyder-py3/project1.py", line 23, in <module>
    print(get_string('quotes.jpg'))

  File "C:/Users/500066016/.spyder-py3/project1.py", line 20, in get_string
    res = pytesseract.image_to_string('thesh.jpg')

  File "C:\Users\500066016\AppData\Local\Continuum\anaconda3\lib\site-packages\pytesseract\pytesseract.py", line 294, in image_to_string
    return run_and_get_output(*args)

  File "C:\Users\500066016\AppData\Local\Continuum\anaconda3\lib\site-packages\pytesseract\pytesseract.py", line 202, in run_and_get_output
    run_tesseract(**kwargs)

  File "C:\Users\500066016\AppData\Local\Continuum\anaconda3\lib\site-packages\pytesseract\pytesseract.py", line 172, in run_tesseract
    raise TesseractNotFoundError()

TesseractNotFoundError: tesseract is not installed or it's not in your path

Answer 1:

Step 1: Download and install Tesseract OCR (for example, the Windows installer from https://github.com/UB-Mannheim/tesseract/wiki mentioned in the question).

Step 2: After installing, find the "Tesseract-OCR" folder, open it, and locate tesseract.exe.

Step 3: Copy the full path to tesseract.exe.

Step 4: Pass this path to pytesseract in your code, like this:

pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files\Tesseract-OCR\tesseract.exe"

Note: replace C:\Program Files\Tesseract-OCR\tesseract.exe with the path you copied.
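
As a rough sketch of Step 4, the snippet below checks that the executable exists before assigning it. The names find_tesseract, configure_pytesseract, and DEFAULT_CANDIDATES are hypothetical helpers made up for this illustration, not part of pytesseract, and the default path is only the example location used in this answer. The one real setting involved is pytesseract.pytesseract.tesseract_cmd.

```python
import os
import shutil

# Example install location from this answer; adjust it to the path you copied.
DEFAULT_CANDIDATES = (r"C:\Program Files\Tesseract-OCR\tesseract.exe",)


def find_tesseract(candidates=DEFAULT_CANDIDATES):
    """Return the first existing candidate path, else whatever is on PATH, else None."""
    for path in candidates:
        if os.path.isfile(path):
            return path
    # Fall back to PATH; shutil.which returns None when nothing is found.
    return shutil.which("tesseract")


def configure_pytesseract(candidates=DEFAULT_CANDIDATES):
    """Point pytesseract at the executable found by find_tesseract()."""
    import pytesseract  # imported here so find_tesseract() works without it

    exe = find_tesseract(candidates)
    if exe is None:
        raise FileNotFoundError("tesseract executable not found; check the install path")
    pytesseract.pytesseract.tesseract_cmd = exe
    return exe


print(find_tesseract())  # prints the path that would be used, or None
```

Calling configure_pytesseract() once, before the first pytesseract.image_to_string(...) call, should be enough. If it raises FileNotFoundError, the path copied in Step 3 is probably not the one being checked.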

Answer 2:

You need to install Tesseract itself:

!apt install tesseract-ocr
!apt install libtesseract-dev

and the Python packages:

!pip install Pillow
!pip install pytesseract

import pytesseract
from PIL import ImageEnhance, ImageFilter, Image

I ran my code on Google Colab. Here is my example code.

I used an example picture of text taken from a website.

Step 1: Import the packages

import pytesseract
import cv2
import matplotlib.pyplot as plt
from PIL import Image

Step 2: Upload the file text.png to Colab

from google.colab import files
uploaded = files.upload()

current browser session. Please rerun this cell to enable.
---------------------------------------------------------------------------
MessageError                              Traceback (most recent call last)
<ipython-input-31-21dc3c638f66> in <module>()
      1 from google.colab import files
----> 2 uploaded = files.upload()

2 frames
/usr/local/lib/python3.6/dist-packages/google/colab/_message.py in read_reply_from_input(message_id, timeout_sec)
    104         reply.get('colab_msg_id') == message_id):
    105       if 'error' in reply:
--> 106         raise MessageError(reply['error'])
    107       return reply.get('data', None)
    108 
MessageError: TypeError: Cannot read property '_uploadFiles' of undefined

If this error appears, don't worry: run the cell again and it will be accepted. You can then choose the file you want to upload.

Step 3: Read the image and extract the text

Read the image using OpenCV:

image = cv2.imread("text.png")

or use Pillow:

image = Image.open("text.png")

Check that the image loaded correctly by displaying it:

image

Get the string:

string = pytesseract.image_to_string(image)

Print it:

print(string)

Done. Hope this helps.
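
pytesseract runs the tesseract executable as a separate program, so after the install step it can help to confirm that the executable is reachable before calling image_to_string. The following is a minimal sketch using only the standard library. The function name tesseract_version is made up for this example. It relies on the tesseract --version command-line flag.

```python
import shutil
import subprocess


def tesseract_version(command="tesseract"):
    """Return the first line printed by `<command> --version`, or None if unreachable."""
    exe = shutil.which(command)  # None when the command is not on PATH
    if exe is None:
        return None
    result = subprocess.run([exe, "--version"], capture_output=True, text=True)
    # Some Tesseract builds may print the version banner to stderr instead of stdout.
    lines = (result.stdout or result.stderr).strip().splitlines()
    return lines[0] if lines else None


print(tesseract_version() or "tesseract executable not found on PATH")
```

If this reports that the executable is not found, pytesseract will most likely raise TesseractNotFoundError as well, so the install or PATH needs fixing first.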

Answer 3:

It is clear from the error that your system is unable to find the tesseract package. If you are on Windows, simply run the following command in your command prompt:

pip install tesseract

Hope it solves your problem :)

Source: https://stackoverflow.com/questions/51677283/tesseractnotfounderror-tesseract-is-not-installed-or-its-not-in-your-path

Tags

python

python-3.x

image-processing

python-tesseract