My application makes web service requests. The provider will only handle a maximum rate of requests, so I need to throttle them.
When the app ran on a single server
Hystrix was designed for pretty much exactly the scenario you're describing. You can define a thread pool size for each service, which caps the number of concurrent requests to it. By default, requests are rejected once the pool is full; if you set a maximum queue size for the pool, they are queued instead, up to that size. You can also define a timeout for each service. Timeouts count as failures, and when the failure rate for a service crosses a configurable threshold, Hystrix opens a circuit breaker and rejects further requests to that service for a short period of time, giving the service a chance to get back on its feet. There's also real-time monitoring of the entire cluster through Turbine.
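To make the thread pool idea concrete, here is a minimal sketch using only `java.util.concurrent`. It is not Hystrix's API. It only illustrates the same principle: a fixed-size pool plus a bounded queue limits how many requests are in flight or waiting, and anything beyond that is rejected. The class name `BoundedPoolSketch`, the method `countRejected`, and the sizes used are made up for illustration; the blocking task stands in for a slow web service call.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class BoundedPoolSketch {

    /**
     * Submits `requests` tasks that all block until released, and returns
     * how many of them the pool rejected because it was saturated.
     * (Hypothetical helper for illustration, not part of any library.)
     */
    static int countRejected(int poolSize, int queueSize, int requests)
            throws InterruptedException {
        // Tasks wait on this latch, simulating a slow remote service.
        CountDownLatch release = new CountDownLatch(1);

        // Fixed number of worker threads plus a bounded waiting queue.
        // The default handler (AbortPolicy) throws when both are full.
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                poolSize, poolSize, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(queueSize));

        int rejected = 0;
        try {
            for (int i = 0; i < requests; i++) {
                try {
                    pool.execute(() -> {
                        try {
                            release.await(); // pretend to call the service
                        } catch (InterruptedException e) {
                            Thread.currentThread().interrupt();
                        }
                    });
                } catch (RejectedExecutionException e) {
                    // Pool and queue are both full: fail fast instead of piling up.
                    rejected++;
                }
            }
        } finally {
            release.countDown(); // let the blocked tasks finish
            pool.shutdown();
            pool.awaitTermination(5, TimeUnit.SECONDS);
        }
        return rejected;
    }

    public static void main(String[] args) throws InterruptedException {
        // 2 running + 3 queued are accepted; the other 5 of 10 are rejected.
        System.out.println("rejected: " + countRejected(2, 3, 10));
    }
}
```

Note that this caps concurrency on a single JVM, not requests per second across a cluster. Hystrix layers the timeout handling, circuit breaker, and metrics on top of this kind of pool.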