I am trying to deploy an application on Google App Engine that also has an OCR function. I installed Tesseract using Homebrew and am using pytesseract
to wra
The Google App Engine Standard environment is not suitable for your use case. It is true that the pytesseract and Pillow libraries can be installed via pip. However, pytesseract also requires the tesseract-ocr and libtesseract-dev platform packages to be installed, and these are not included in the base image of the App Engine Standard Python 3.7 runtime. This is what produces the error you are getting.
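To see why the pip install alone is not enough, here is a minimal sketch that checks whether a Tesseract executable is visible on the PATH. The helper names `binary_on_path` and `describe_ocr_runtime` are hypothetical and used only for illustration; the check relies on the standard library, not on pytesseract itself.

```python
import shutil


def binary_on_path(name: str) -> bool:
    """Return True if an executable called `name` can be found on PATH."""
    return shutil.which(name) is not None


def describe_ocr_runtime() -> str:
    """Summarize whether the `tesseract` executable is available (hypothetical helper)."""
    if binary_on_path("tesseract"):
        return "tesseract found at " + shutil.which("tesseract")
    return "tesseract not found: install the tesseract-ocr platform package"


if __name__ == "__main__":
    print(describe_ocr_runtime())
```

Running this on a machine or image without the platform package should report that the binary is missing, which is the situation on App Engine Standard.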
The solution is to use Cloud Run, which runs your application in a Docker container and lets you customize the runtime. I have modified this Quickstart guide to run a sample application on Cloud Run that converts an image to text using pytesseract.
My folder structure:
sample
├── requirements.txt
├── Dockerfile
├── app.py
└── test.png
Here is the Dockerfile:
# Use the official Python image.
# https://hub.docker.com/_/python
FROM python:3.7
# Copy local code to the container image.
ENV APP_HOME /app
WORKDIR $APP_HOME
COPY . ./
# Install production dependencies.
RUN pip install Flask gunicorn
RUN pip install -r requirements.txt
# Install tesseract
RUN apt-get update -qqy && apt-get install -qqy \
    tesseract-ocr \
    libtesseract-dev
# Run the web service on container startup. Here we use the gunicorn
# webserver, with one worker process and 8 threads.
# For environments with multiple CPU cores, increase the number of workers
# to be equal to the cores available.
CMD exec gunicorn --bind :$PORT --workers 1 --threads 8 app:app
The contents of app.py:
import os

from flask import Flask
from PIL import Image
import pytesseract

# gunicorn loads this object as `app:app` (see the CMD line in the Dockerfile).
app = Flask(__name__)


@app.route('/')
def hello():
    # Run OCR on the bundled sample image and return the extracted text.
    return pytesseract.image_to_string(Image.open('test.png'))


if __name__ == "__main__":
    # Local development only; in the container gunicorn serves the app.
    app.run(debug=True, host='0.0.0.0', port=int(os.environ.get('PORT', 8080)))
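Both the gunicorn command in the Dockerfile and the `app.run` call read the listening port from the `PORT` environment variable, which Cloud Run sets inside the container. As a rough sketch of that lookup in isolation, the hypothetical helper `resolve_port` below falls back to 8080 when the variable is missing or empty; the name and the empty-string handling are assumptions for illustration, not part of the original app.

```python
import os


def resolve_port(environ=None, default=8080):
    """Return the port from the PORT variable, or `default` if unset or empty."""
    if environ is None:
        environ = os.environ
    raw = environ.get("PORT")
    if not raw:
        return default
    return int(raw)


if __name__ == "__main__":
    print(resolve_port())
```

Passing a plain dict instead of `os.environ` makes the lookup easy to exercise locally, for example `resolve_port({"PORT": "9000"})`.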
The requirements.txt:
Flask==1.1.1
pytesseract==0.3.0
Pillow==6.2.0
Now, to containerize and deploy your application, just run:
gcloud builds submit --tag gcr.io/<PROJECT_ID>/helloworld
to build and submit the container to Container Registry.
gcloud beta run deploy --image gcr.io/<PROJECT_ID>/helloworld --platform managed
to deploy the container to Cloud Run.