How to fix Cloud Run error 'The request was aborted because there was no available instance'

Question

I'm using managed Cloud Run to deploy a container with concurrency=1. Once it is deployed, I fire four long-running requests in parallel. Most of the time everything works fine, but occasionally I get a 500 from one of the nodes within a few seconds; the logs show only the error message quoted in the title.

Retrying with exponential back-off did not improve the situation; the retries also end in 500s. The Stackdriver logs provide no further information either.
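
For reference, retry with exponential back-off usually looks something like the minimal Python sketch below. The actual retry code used here is not shown, so the function name `call_with_backoff`, its parameters and the "full jitter" variant are all assumptions made for illustration.

```python
import random
import time


def call_with_backoff(send, max_attempts=5, base_delay=1.0, max_delay=30.0,
                      sleep=time.sleep):
    """Call send() until it returns a non-5xx status or attempts run out.

    send is any zero-argument callable that returns an HTTP status code
    (hypothetical; it stands in for the real request to the service).
    Returns the last status code seen.
    """
    status = None
    for attempt in range(max_attempts):
        status = send()
        if status < 500:
            return status
        if attempt < max_attempts - 1:
            # Exponential back-off with full jitter: wait a random time
            # between 0 and min(max_delay, base_delay * 2**attempt).
            sleep(random.uniform(0, min(max_delay, base_delay * 2 ** attempt)))
    return status
```

With the defaults, the waits are capped at roughly 1s, 2s, 4s and 8s, so all five attempts can be spent in well under a minute. That is short compared with an 8-minute request timeout, which may be one reason retries alone do not help when every instance is busy with a long request.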

Potentially relevant gcloud beta run deploy arguments:

--memory 2Gi --concurrency 1 --timeout 8m --platform managed

What exactly does the error message mean, and how can I solve the issue?

Answer 1:

This error message can appear when the infrastructure did not scale up fast enough to keep up with a traffic spike. The infrastructure keeps a request in the queue only for a limited time (about 10 seconds) and then aborts it.

This usually happens when:

traffic suddenly increases sharply
the cold start time is long
the request time is long
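
To see how these three factors interact, here is a toy Python model of a burst of requests hitting a service with concurrency 1. It is only a sketch under stated assumptions: it is not Cloud Run's real scheduler, all instances are assumed to start scaling at time 0, the 10-second queue limit is the approximate figure given above, and the name `simulate_burst` and its parameters are hypothetical.

```python
import heapq


def simulate_burst(requests, warm, max_instances, cold_start, duration,
                   queue_limit=10.0):
    """Return (served, aborted) for `requests` arriving together at t=0.

    warm          -- instances already running and idle at t=0
    max_instances -- upper bound on instances (warm ones included)
    cold_start    -- seconds before a new instance can take a request
    duration      -- seconds each request occupies an instance
    queue_limit   -- seconds a request may wait before it is aborted
    """
    # Time at which each instance can next accept a request.
    ready = [0.0] * min(warm, max_instances)
    ready += [cold_start] * (max_instances - len(ready))
    heapq.heapify(ready)

    served = aborted = 0
    for _ in range(requests):
        if not ready or ready[0] > queue_limit:
            # No instance frees up before the queue limit: request aborted.
            aborted += 1
        else:
            # Earliest available instance takes the request (concurrency 1).
            served += 1
            heapq.heapreplace(ready, ready[0] + duration)
    return served, aborted
```

For example, `simulate_burst(4, 0, 2, 2.0, 60.0)` returns `(2, 2)`: with only two instances and 60-second requests, the third and fourth request cannot be picked up within 10 seconds. Raising the limit to ten instances, `simulate_burst(4, 0, 10, 2.0, 60.0)`, returns `(4, 0)`.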

Answer 2:

I also experienced the problem, and it is easy to reproduce. I have a Fibonacci container that computes fibo(45) in 6s. I used hey to send 200 requests, with Cloud Run concurrency set to 1.

Out of 200 requests, I got 8 such errors. In my case the causes are a sudden traffic spike and a long processing time. (The cold start is short for me, since the service is written in Go.)
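
A reproduction along these lines can be sketched in Python using only the standard library. This is not the hey tool itself: the names `fetch_status` and `burst`, the worker count of 50 and the 60-second timeout are assumptions, and the URL must be replaced with that of your own service.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
import urllib.error
import urllib.request


def fetch_status(url, timeout=60):
    """Return the HTTP status code for url, or None on a network error."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # urlopen raises HTTPError for 4xx/5xx responses; keep the code.
        return err.code
    except OSError:
        # Covers URLError, timeouts and connection failures.
        return None


def burst(url, total=200, workers=50, fetch=fetch_status):
    """Send `total` requests using `workers` threads; count status codes."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return Counter(pool.map(fetch, [url] * total))
```

Calling `burst("https://YOUR-SERVICE-URL/")` returns a `Counter` mapping each status code to how often it occurred, so the number of 500 responses in a run can be read off directly.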

Answer 3:

I was able to resolve this on my service by raising the maximum autoscaling container count from 2 to 10. There should really be no reason for 2 to be anywhere near too low for my traffic, but I suspect something in the Cloud Run internals was somehow tying up as many as 2 containers.

Source: https://stackoverflow.com/questions/57007386/how-to-fix-cloudrun-error-the-request-was-aborted-because-there-was-no-availabl

Tags

google-cloud-platform

Serverless

google-cloud-run