Cannot make Tesseract work on Google App Engine with Python 3

太阳男子 2021-01-14 18:39

I am trying to deploy an application to Google App Engine that also has an OCR function. I downloaded tesseract using Homebrew and am using pytesseract to wra

1 answer
  • 2021-01-14 19:28

    The Google App Engine Standard environment is not suitable for this use case. pytesseract and Pillow can both be installed via pip, but pytesseract only wraps the Tesseract engine, which is provided by the tesseract-ocr and libtesseract-dev system packages. These packages are not included in the base image of the App Engine Standard Python 3.7 runtime, and that missing dependency is what produces the error you are getting.
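
To confirm that the missing system binary is the cause, a small check like the following can be run in the target environment. This is a minimal sketch using only the standard library; the helper names `find_tesseract` and `tesseract_version` are hypothetical and not part of pytesseract.

```python
import shutil
import subprocess


def find_tesseract(binary_name="tesseract"):
    """Return the full path of the tesseract executable, or None if it is absent."""
    return shutil.which(binary_name)


def tesseract_version(binary_path):
    """Return the first line printed by `tesseract --version`."""
    result = subprocess.run(
        [binary_path, "--version"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge streams in case the version goes to stderr
        universal_newlines=True,
        check=False,
    )
    lines = result.stdout.splitlines()
    return lines[0] if lines else ""


if __name__ == "__main__":
    path = find_tesseract()
    if path is None:
        print("tesseract binary not found on PATH; pytesseract cannot work here")
    else:
        print("found", path, "->", tesseract_version(path))
```

If the lookup returns None, installing pytesseract with pip alone will not help, because the executable it calls is not there.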

    The solution is to use Cloud Run, which runs your application in a Docker container so that you can customize the runtime. I modified this Quickstart guide to deploy a sample application on Cloud Run that converts an image to text using pytesseract.

    My folder structure:

    sample
    ├── requirements.txt
    ├── Dockerfile
    ├── app.py
    └── test.png
    

    Here is the Dockerfile:

    # Use the official Python image.
    # https://hub.docker.com/_/python
    FROM python:3.7
    
    # Copy local code to the container image.
    ENV APP_HOME /app
    WORKDIR $APP_HOME
    COPY . ./
    
    # Install production dependencies.
    RUN pip install Flask gunicorn
    RUN pip install -r requirements.txt
    
    # Install the Tesseract OCR engine and its development files,
    # which pytesseract needs at runtime.
    RUN apt-get update -qqy && apt-get install -qqy \
            tesseract-ocr \
            libtesseract-dev
    
    # Run the web service on container startup. Here we use the gunicorn
    # webserver, with one worker process and 8 threads.
    # For environments with multiple CPU cores, increase the number of workers
    # to be equal to the cores available.
    CMD exec gunicorn --bind :$PORT --workers 1 --threads 8 app:app
    

    The contents of app.py:

    import os
    
    from flask import Flask
    from PIL import Image
    import pytesseract
    
    
    # gunicorn is started with `app:app` in the Dockerfile, so it looks for an
    # object called `app` in `app.py`.
    app = Flask(__name__)
    
    
    @app.route('/')
    def hello():
        # Run OCR on the bundled sample image and return the recognized text.
        return pytesseract.image_to_string(Image.open('test.png'))
    
    
    if __name__ == "__main__":
        # Local development only; PORT falls back to 8080 when it is not set.
        app.run(debug=True, host='0.0.0.0', port=int(os.environ.get('PORT', 8080)))
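
Once the service is running, the root route can be exercised with a short client. This is only a sketch: the function name `fetch_ocr_text` is hypothetical, and `http://localhost:8080/` is an assumed local address that should be replaced with wherever the service actually listens.

```python
import urllib.request


def fetch_ocr_text(base_url, timeout=30):
    """GET the given URL and return the response body decoded as UTF-8 text."""
    with urllib.request.urlopen(base_url, timeout=timeout) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    # Assumed local address; replace it with the URL of your own service.
    print(fetch_ocr_text("http://localhost:8080/"))
```

The response body should be whatever text Tesseract recognized in test.png.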
    

    The requirements.txt:

    Flask==1.1.1
    pytesseract==0.3.0
    Pillow==6.2.0
    

    Now to containerize and deploy your application just run:

    1. gcloud builds submit --tag gcr.io/<PROJECT_ID>/helloworld to build and submit the container to Container Registry.

    2. gcloud beta run deploy --image gcr.io/<PROJECT_ID>/helloworld --platform managed to deploy the container to Cloud Run.
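
The two steps above can also be scripted. The sketch below only assembles the same two commands as argument lists and prints them; `build_deploy_commands` is a hypothetical helper, and `my-project` is a placeholder project ID.

```python
import shlex


def build_deploy_commands(project_id, image_name="helloworld"):
    """Return the build and deploy commands from the steps above as argument lists."""
    image = "gcr.io/{}/{}".format(project_id, image_name)
    build = ["gcloud", "builds", "submit", "--tag", image]
    deploy = ["gcloud", "beta", "run", "deploy",
              "--image", image, "--platform", "managed"]
    return [build, deploy]


if __name__ == "__main__":
    # "my-project" is a placeholder, not a real project ID.
    for command in build_deploy_commands("my-project"):
        print(" ".join(shlex.quote(part) for part in command))
        # To actually execute a command: subprocess.run(command, check=True)
```

Keeping the commands as lists avoids shell quoting problems if they are later passed to subprocess.run.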
