Background: I\'ve deployed an MVC3 application to 2 Azure Web Role instances, but I\'m confused as to how I can test out the possibility of one of these instances failing.
If you have RDP access enabled to your instances you can very easily remove one or more instance out from LoadBalancer even when the instance is running healthy without writing any line of code. You just need to RDP to your instance and then use PowerShell scripts to take the instance off from loadbalancer. In my following blog I have described the exact procedure:
http://blogs.msdn.com/b/avkashchauhan/archive/2012/01/27/windows-azure-troubleshooting-taking-specific-windows-azure-instance-offline.aspx
The above details also help to run load testing by removing N instances from total M instances .
If you use the StatusCheck event to set the status to Busy, your instance won't receive any more requests from the loadbalancer. You might want to write some code to send messages to your instances that will set them as busy for some time (using queues for example).
public override bool OnStart()
{
RoleEnvironment.StatusCheck += RoleEnvironmentStatusCheck;
return base.OnStart();
}
// Use the busy object to indicate that the status of the role instance must be Busy
private volatile bool busy = true;
private void RoleEnvironmentStatusCheck(object sender, RoleInstanceStatusCheckEventArgs e)
{
If (this.busy)
{
// Sets the status of the role instance to Busy for a short interval.
// If you want the role instance to remain busy, add code to
// continue to call the SetBusy method
e.SetBusy();
}
}
Reference: http://msdn.microsoft.com/en-us/library/microsoft.windowsazure.serviceruntime.roleenvironment.statuscheck.aspx
Besides that you can also simply reboot your instance (using the Windows Azure portal or Remote Desktop).
Steve Marx (aka smarx) developed a tool called WazMonkey. It is the azure counterpart of the tool Chaos Monkey developed by the netflix team to simulate broken instances in amazon AWS. WazMonkey terminates instances of a windows azure cloud service randomly to test the resilience of a cloud applications.