Concurrent requests handling on Google App Engine

無奈伤痛 2021-01-23 09:29

I was experimenting with concurrent request handling on few platforms.

The aim of the experiment was to have a broad measure of the capacity bounds of s

3 Answers
  • 2021-01-23 09:46

    To make optimal usage in terms of minimizing costs, you need to configure a few things in app.yaml:

    • Enable threadsafe: true - it's actually a Python config option and not applicable to Go, but I would set it just in case.
    • Adjust scaling section:
      • max_concurrent_requests - set to maximum 80
      • max_idle_instances - set to minimum 0
      • max_pending_latency - set it to automatic or greater than min_pending_latency
      • min_idle_instances - set it to 0
      • min_pending_latency - set it to a higher number. If you are OK with 1 second of latency and your handlers take on average 100ms to process, set it to 900ms.
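
    Put together, the settings above might look like the following `app.yaml` sketch. The values are illustrative, not a definitive configuration - tune them against your own latency and cost targets, and note that `threadsafe` is a Python-era flag that some runtimes ignore:

    ```yaml
    runtime: go              # runtime name varies by GAE generation
    threadsafe: true         # Python-era flag; not applicable to Go, kept "just in case"

    automatic_scaling:
      max_concurrent_requests: 80    # maximum allowed per instance
      min_idle_instances: 0          # no warm spares -> lowest cost
      max_idle_instances: 0
      min_pending_latency: 900ms     # let requests queue before spinning up a new instance
      max_pending_latency: automatic # or any value greater than min_pending_latency
    ```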

    Then you should be able to process a lot of requests on a single instance.

    If you're OK with burning cash for the sake of responsiveness & scalability - increase min_idle_instances & max_idle_instances.

    Also, do you use similar instance types for the VM and GAE? The GAE F1 instance is not very fast and is better suited to async tasks like I/O work (datastore, HTTP, etc.). You can configure a more powerful instance class to scale better for computation-intensive tasks.
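
    For CPU-bound handlers, the instance class can be raised with a single `app.yaml` line; the exact class to pick is workload-dependent (F4 here is only an example):

    ```yaml
    instance_class: F4   # more CPU and memory than the default F1; see the instance classes docs
    ```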

    Also, do you test on a paid account? Free accounts have quotas, and App Engine will refuse a percentage of requests if it believes the load would exceed the daily quota were it to continue at the same rate.

  • 2021-01-23 10:05

    Thanks to everyone for their help. Many interesting points and insights were raised in the answers to this topic.

    The fact that the Cloud Console was reporting no errors led me to believe that the bottleneck was occurring after the actual request processing.

    I found the reason why the results were not as expected: bandwidth.

    Each response had a payload of roughly 1MB, so responding to 500 simultaneous connections from the same client would clog the line, resulting in timeouts. This was obviously not happening when requesting against the VM, where the bandwidth is much larger.
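
    A back-of-the-envelope check shows why the client side saturates. The downlink figure below is hypothetical (not from the experiment); the payload size and connection count are from the post:

    ```go
    package main

    import "fmt"

    func main() {
    	const payloadMB = 1.0   // response size from the experiment
    	const connections = 500 // simultaneous requests from one client
    	const clientMbps = 100.0 // hypothetical client downlink

    	// Total data the single client must receive, in megabits.
    	totalMegabits := payloadMB * 8 * connections

    	// Time to drain all responses if the downlink is the bottleneck.
    	seconds := totalMegabits / clientMbps
    	fmt.Printf("draining %d responses takes ~%.0f s at %.0f Mbps\n",
    		connections, seconds, clientMbps)
    }
    ```

    Even at a generous 100 Mbps, 500 MB of responses takes tens of seconds to arrive, comfortably past typical request timeouts - consistent with the bandwidth explanation above.
    
    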

    Now GAE scaling is in line with what I expected: it successfully scales to accommodate each incoming request.

  • 2021-01-23 10:06

    Extending Alexander's answer.

    The GAE scaling logic is based on incoming traffic trend analysis.

    The key to handling your case - sudden spikes in traffic (which can't be taken into account in the trend analysis because of how quickly they vary) - is to have sufficient resident (idle) instances configured for your application to handle such traffic until GAE spins up additional dynamic instances. It can handle peaks as high as you want (if your pockets are deep enough).
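
    In `app.yaml` terms, this means keeping some resident instances warm; the count below is illustrative - size it for the worst spike you expect to absorb while dynamic instances start:

    ```yaml
    automatic_scaling:
      min_idle_instances: 5   # resident instances held ready for sudden traffic spikes
    ```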

    See Scaling dynamic instances for more details.
