How to fix CloudRun error 'The request was aborted because there was no available instance'

Submitted by 孤者浪人 on 2019-12-06 08:42:47

Question


I'm using managed Cloud Run to deploy a container with concurrency=1. Once it is deployed, I fire four long-running requests in parallel. Most of the time everything works fine, but occasionally I get a 500 from one of the nodes within a few seconds; the logs show only the error message quoted in the title.

Retrying with exponential back-off did not improve the situation; the retries also end in 500s. The Stackdriver logs do not provide any further information either.

Potentially relevant gcloud beta run deploy arguments:

--memory 2Gi --concurrency 1 --timeout 8m --platform managed

What does the error message mean exactly -- and how can I solve the issue?


Answer 1:


This error message can appear when the infrastructure does not scale up fast enough to keep up with a traffic spike. The infrastructure keeps a request in the queue only for a limited time (about 10 seconds) and then aborts it.

This usually happens when:

  1. traffic increases suddenly and sharply
  2. cold start time is long
  3. request time is long



Answer 2:


I also ran into this problem, and it is easy to reproduce. I have a Fibonacci container that computes fibo(45) in about 6 s. I use hey to send 200 requests, with Cloud Run concurrency set to 1.

Out of 200 requests, I get 8 errors like this one. In my case the causes are a sudden traffic spike and a long processing time. (The cold start is short for me, since the service is written in Go.)




Answer 3:


I was able to resolve this on my service by raising the maximum autoscaling container count (the maximum-instances setting, --max-instances in gcloud) from 2 to 10. There is really no reason why 2 should be anywhere near too low for my traffic, but I suspect that something in the Cloud Run internals was tying up as many as 2 containers.



Source: https://stackoverflow.com/questions/57007386/how-to-fix-cloudrun-error-the-request-was-aborted-because-there-was-no-availabl
