Question
I have two EC2 instances behind an ELB, in an Auto Scaling group. The scale-up policy is as below:
CPUUtilization >= 70 for 300 seconds (Adds one server)
While the Auto Scaling activity is taking place, load on the existing instances is already at 99% and connections are being dropped.
Is there any way to handle this more efficiently?
Answer 1:
The trick to Auto Scaling is in defining an alarm that can accurately identify the load of your system.
CPU Utilization is not always the right measure to use -- your application might only be able to handle a limited number of connections, it might be squeezed on RAM and the types of requests might vary too.
A good idea is to monitor your system closely during peak loads to determine an accurate signal that identifies busy periods (or, even better, helps you predict impending busy periods). Use standard monitoring tools on your individual instances to track things such as free memory, number of application users, number of transactions, etc.
You can use normal monitoring tools, or you can write something that pushes metrics to Amazon CloudWatch, so that you go beyond the basic CPU and Network metrics that CloudWatch normally provides. You could even use the Load Balancer's Latency metric to trigger scaling when the application slows down (custom code required).
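As a rough illustration of pushing your own metrics, here is a minimal sketch using boto3's `put_metric_data`. The namespace `MyApp/Capacity`, the metric names, and the helper functions are all made up for the example; only the `put_metric_data` call and the `AutoScalingGroupName` dimension are standard. It assumes boto3 is installed and AWS credentials are configured on the instance.

```python
from datetime import datetime, timezone


def build_metric_payload(asg_name, active_connections, free_memory_mb):
    """Build the keyword arguments for CloudWatch's put_metric_data call."""
    now = datetime.now(timezone.utc)
    # Tagging by group name lets one alarm aggregate across all instances.
    dimensions = [{"Name": "AutoScalingGroupName", "Value": asg_name}]
    return {
        "Namespace": "MyApp/Capacity",  # hypothetical namespace
        "MetricData": [
            {
                "MetricName": "ActiveConnections",  # hypothetical metric name
                "Dimensions": dimensions,
                "Timestamp": now,
                "Value": float(active_connections),
                "Unit": "Count",
            },
            {
                "MetricName": "FreeMemory",  # hypothetical metric name
                "Dimensions": dimensions,
                "Timestamp": now,
                "Value": float(free_memory_mb),
                "Unit": "Megabytes",
            },
        ],
    }


def publish(payload):
    """Send the payload to CloudWatch; needs boto3 and AWS credentials."""
    import boto3  # imported lazily so the payload builder works without it

    boto3.client("cloudwatch").put_metric_data(**payload)
```

You would run something like `publish(build_metric_payload("web-asg", 120, 512))` from cron or a small agent on each instance, then point a CloudWatch alarm at the new metric instead of (or alongside) CPUUtilization.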
Once you have a reliable signal to detect when the system is approaching capacity and needs to scale out, you can then concentrate on shortening the time to add new capacity. Measure the time it takes for a new instance to launch and start accepting traffic. Try to reduce launch times by using a fully-configured AMI rather than installing software via User Data. Maybe you can remove or turn off services on the instance to make it start faster. Try using different EBS volume types (e.g. General Purpose SSD can burst up to 3000 IOPS) and different Instance Types.
Perhaps even scale out earlier (e.g. at 50%) -- the extra expense could be minor compared to the improved service to your users.
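To get a feel for how early "earlier" needs to be, here is a back-of-the-envelope sketch. It assumes load keeps climbing at a steady rate after the threshold is crossed, which is a simplification; the function name, the growth rate, and the launch time are invented for illustration, and only the 300-second alarm period comes from the question.

```python
def latest_safe_threshold(growth_pct_per_min, alarm_delay_s, launch_time_s,
                          ceiling_pct=100.0):
    """Return the highest CPU threshold (%) that still leaves time to add capacity.

    Assumes CPU rises linearly at growth_pct_per_min once the threshold is hit.
    """
    # Time between crossing the threshold and the new instance taking traffic.
    reaction_minutes = (alarm_delay_s + launch_time_s) / 60.0
    # CPU the existing instances will gain while waiting for help.
    headroom_needed = growth_pct_per_min * reaction_minutes
    return max(0.0, ceiling_pct - headroom_needed)


# 5% CPU growth per minute, 300 s alarm period, 180 s to launch an instance.
print(latest_safe_threshold(5, 300, 180))  # 60.0
```

With those made-up numbers, a 70% threshold is already too late: the instances would hit 100% before the new one is ready. Shortening the alarm period or the launch time raises the threshold you can afford.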
Your goal should be ensuring that users never have slow service or dropped connections.
Answer 2:
Check whether your instances are healthy. Instances (or container instances) usually go into an unhealthy or draining state when they are hit by sudden, overwhelming traffic.
In that case, set up an autoscaling policy that detects when the number of healthy instances drops below your minimum, and scale out to at least 50% more than that minimum threshold. For example, if a system running with 30 healthy instances suddenly drops to 25, scale out to 45 instances so that it can handle the sudden spike.
Later, as the traffic cools down, your scale-down policy on metrics such as CPU and memory can bring those 45 instances back down to the required number.
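The arithmetic behind that example can be sketched as below. This is only an illustration of the rule of thumb, not an AWS API; the function name and the `boost_pct` parameter are made up here.

```python
import math


def spike_capacity(baseline, healthy_now, boost_pct=50):
    """Return the desired instance count after a drop in healthy instances."""
    if healthy_now >= baseline:
        return baseline  # nothing has dropped out, keep the current size
    # Round up so that small groups still gain at least one instance.
    return math.ceil(baseline * (100 + boost_pct) / 100)


print(spike_capacity(30, 25))  # 45
print(spike_capacity(30, 30))  # 30
```

For the two-instance group in the question, losing one healthy instance would give `spike_capacity(2, 1)`, which is 3.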
Source: https://stackoverflow.com/questions/44312068/how-to-handle-a-sudden-spike-in-web-traffic-during-autoscaling