I was experimenting with concurrent request handling on few platforms.
The aim of the experiment was to have a broad measure of the capacity bounds of s
Thanks everyone for their help. Many interesting points and insights have been made by the answers I had on this topic.
The fact the the Cloud Console were reporting no errors led me to believe that the bottleneck was happening after the real request processing.
I found the reason why the results were not as expected: bandwidth.
Each response had a payload of roughly 1MB and thus responding to 500 simultaneous connections from the same client would clog the lines, resulting in timeouts. This was obviously not happening when requesting to the VM, where the bandwith is much larger.
Now GAE scaling is in line with what I expected: it successfully scales to accomodate each incoming request.