问题
Following the tutorial here I have the following 2 files:
app.py
from flask import Flask, request
app = Flask(__name__)
@app.route('/', methods=['GET'])
def hello():
"""Return a friendly HTTP greeting."""
who = request.args.get('who', 'World')
return f'Hello {who}!\n'
if __name__ == '__main__':
# Used when running locally only. When deploying to Cloud Run,
# a webserver process such as Gunicorn will serve the app.
app.run(host='localhost', port=8080, debug=True)
Dockerfile
# Use an official lightweight Python image.
# https://hub.docker.com/_/python
FROM python:3.7-slim
# Install production dependencies.
RUN pip install Flask gunicorn
# Copy local code to the container image.
WORKDIR /app
COPY . .
# Service must listen to $PORT environment variable.
# This default value facilitates local development.
ENV PORT 8080
# Run the web service on container startup. Here we use the gunicorn
# webserver, with one worker process and 8 threads.
# For environments with multiple CPU cores, increase the number of workers
# to be equal to the cores available.
CMD exec gunicorn --bind 0.0.0.0:$PORT --workers 1 --threads 8 app:app
I then build and run them using Cloud Build and Cloud Run:
PROJECT_ID=$(gcloud config get-value project)
DOCKER_IMG="gcr.io/$PROJECT_ID/helloworld-python"
gcloud builds submit --tag $DOCKER_IMG
gcloud run deploy --image $DOCKER_IMG --platform managed
The code appears to run fine, and I am able to access the app on the given URL. However the logs seem to indicate a critical error, and the workers keep restarting. Here is the log file from Cloud Run after starting up the app and making a few requests in my web browser:
2020-03-05T03:37:39.392Z Cloud Run CreateService helloworld-python ...
2020-03-05T03:38:03.285477Z[2020-03-05 03:38:03 +0000] [1] [INFO] Starting gunicorn 20.0.4
2020-03-05T03:38:03.287294Z[2020-03-05 03:38:03 +0000] [1] [INFO] Listening at: http://0.0.0.0:8080 (1)
2020-03-05T03:38:03.287362Z[2020-03-05 03:38:03 +0000] [1] [INFO] Using worker: threads
2020-03-05T03:38:03.318392Z[2020-03-05 03:38:03 +0000] [4] [INFO] Booting worker with pid: 4
2020-03-05T03:38:15.057898Z[2020-03-05 03:38:15 +0000] [1] [INFO] Starting gunicorn 20.0.4
2020-03-05T03:38:15.059571Z[2020-03-05 03:38:15 +0000] [1] [INFO] Listening at: http://0.0.0.0:8080 (1)
2020-03-05T03:38:15.059609Z[2020-03-05 03:38:15 +0000] [1] [INFO] Using worker: threads
2020-03-05T03:38:15.099443Z[2020-03-05 03:38:15 +0000] [4] [INFO] Booting worker with pid: 4
2020-03-05T03:38:16.320286ZGET200 297 B 2.9 s Safari 13 https://helloworld-python-xhd7w5igiq-ue.a.run.app/
2020-03-05T03:38:16.489044ZGET404 508 B 6 ms Safari 13 https://helloworld-python-xhd7w5igiq-ue.a.run.app/favicon.ico
2020-03-05T03:38:21.575528ZGET200 288 B 6 ms Safari 13 https://helloworld-python-xhd7w5igiq-ue.a.run.app/
2020-03-05T03:38:27.000761ZGET200 285 B 5 ms Safari 13 https://helloworld-python-xhd7w5igiq-ue.a.run.app/?who=me
2020-03-05T03:38:27.347258ZGET404 508 B 13 ms Safari 13 https://helloworld-python-xhd7w5igiq-ue.a.run.app/favicon.ico
2020-03-05T03:38:34.802266Z[2020-03-05 03:38:34 +0000] [1] [CRITICAL] WORKER TIMEOUT (pid:4)
2020-03-05T03:38:35.302340Z[2020-03-05 03:38:35 +0000] [4] [INFO] Worker exiting (pid: 4)
2020-03-05T03:38:48.803505Z[2020-03-05 03:38:48 +0000] [5] [INFO] Booting worker with pid: 5
2020-03-05T03:39:10.202062Z[2020-03-05 03:39:09 +0000] [1] [CRITICAL] WORKER TIMEOUT (pid:5)
2020-03-05T03:39:10.702339Z[2020-03-05 03:39:10 +0000] [5] [INFO] Worker exiting (pid: 5)
2020-03-05T03:39:18.801194Z[2020-03-05 03:39:18 +0000] [6] [INFO] Booting worker with pid: 6
Note the worker timeouts and reboots at the end of the logs. The fact that its a CRITICAL error makes me think it shouldn't be happing. Is this expected behavior? Is this a side effect of the Cloud Run machinery starting and stopping my service as requests come and go?
回答1:
Cloud Run has scaled down one of your instances, and the gunicorn
arbiter is considering it stalled.
You should add --timeout 0
to your gunicorn
invocation to disable the worker timeout entirely, it's unnecessary for Cloud Run.
回答2:
Here's a working example of a Flask app on Cloud run. My guess is that your last line or the Decker file and the last part of your python file are the ones causing this behavior.
main.py
# main.py
#gcloud beta run services replace service.yaml
from flask import Flask
app = Flask(__name__)
@app.route("/")
def hello_world():
msg = "Hello World"
return msg
Dockerfile (the apt-get part is not needed)
# Use the official Python image.
# https://hub.docker.com/_/python
FROM python:3.7
# Install manually all the missing libraries
RUN apt-get update
RUN apt-get install -y gconf-service libasound2 libatk1.0-0 libcairo2 libcups2 libfontconfig1 libgdk-pixbuf2.0-0 libgtk-3-0 libnspr4 libpango-1.0-0 libxss1 fonts-liberation libappindicator1 libnss3 lsb-release xdg-utils
# Install Python dependencies.
COPY requirements.txt requirements.txt
RUN pip install -r requirements.txt
ENV APP_HOME /app
WORKDIR $APP_HOME
COPY . .
CMD exec gunicorn --bind :$PORT --workers 1 --threads 8 main:app
then build using:
gcloud builds submit --tag gcr.io/[PROJECT]/[MY_SERVICE]
and deploy:
gcloud beta run deploy [MY_SERVICE] --image gcr.io/[PROJECT]/[MY_SERVICE] --region europe-west1 --platform managed
UPDATE I've checked again the logs you've provided. Getting this kind of warning/error is normal at the beginning after a new deployment as your old instances are not handling any requests but instead they are idle at that time until they are completely shut down.
Gunicorn also has a default timeout of 30s which matches with the time between the time of "Booting worker" and the time you see the error.
来源:https://stackoverflow.com/questions/60537977/critical-worker-timeout-in-logs-when-running-hello-cloud-run-with-python-f