Question
I am using 'Camden.SR5' for spring-cloud-dependencies, with Spring Boot '1.5.2.RELEASE'.
In my current setup, I have:
- eureka server
- config server (running on random ports)
- zuul gateway server
- and 2 instances of a service (running on random ports)
All of these instances register successfully with Eureka.
When all the services are running, load balancing through Zuul works properly without any issues.
When an instance is killed, Zuul still tries to fulfil the request using the instance that is down. However, if I wait until the Eureka registry has been fetched again after shutting down the instance, requests are fulfilled by the other instances that are 'UP'.
2017-03-07 19:57:41.409 DEBUG 26658 --- [nio-5555-exec-3] c.n.l.reactive.LoadBalancerCommand : Got error org.apache.http.conn.HttpHostConnectException: Connect to 10.99.4.151:64381 [/10.99.4.151] failed: Connection refused when executed on server 10.99.4.151:64381
2017-03-07 19:57:41.420 DEBUG 26658 --- [nio-5555-exec-3] com.netflix.hystrix.AbstractCommand : Error executing HystrixCommand.run(). Proceeding to fallback logic ...
com.netflix.client.ClientException: null
at com.netflix.client.AbstractLoadBalancerAwareClient.executeWithLoadBalancer(AbstractLoadBalancerAwareClient.java:123) ~[ribbon-loadbalancer-2.2.0.jar:2.2.0]
at com.netflix.client.AbstractLoadBalancerAwareClient.executeWithLoadBalancer(AbstractLoadBalancerAwareClient.java:81) ~[ribbon-loadbalancer-2.2.0.jar:2.2.0]
at org.springframework.cloud.netflix.zuul.filters.route.support.AbstractRibbonCommand.run(AbstractRibbonCommand.java:96) ~[spring-cloud-netflix-core-1.2.5.RELEASE.jar:1.2.5.RELEASE]
at org.springframework.cloud.netflix.zuul.filters.route.support.AbstractRibbonCommand.run(AbstractRibbonCommand.java:42) ~[spring-cloud-netflix-core-1.2.5.RELEASE.jar:1.2.5.RELEASE]
at org.apache.http.conn.socket.PlainConnectionSocketFactory.connectSocket(PlainConnectionSocketFactory.java:75) ~[httpclient-4.5.3.jar:4.5.3]
at org.apache.http.impl.conn.DefaultHttpClientConnectionOperator.connect(DefaultHttpClientConnectionOperator.java:142) ~[httpclient-4.5.3.jar:4.5.3]
... 162 common frames omitted
2017-03-07 19:57:41.425 DEBUG 26658 --- [nio-5555-exec-3] com.netflix.hystrix.AbstractCommand : No fallback for HystrixCommand.
java.lang.UnsupportedOperationException: No fallback available.
at com.netflix.hystrix.HystrixCommand.getFallback(HystrixCommand.java:292) [hystrix-core-1.5.6.jar:1.5.6]
at org.springframework.cloud.netflix.zuul.filters.route.support.AbstractRibbonCommand.getFallback(AbstractRibbonCommand.java:117) ~[spring-cloud-netflix-core-1.2.5.RELEASE.jar:1.2.5.RELEASE]
at org.springframework.cloud.netflix.zuul.filters.route.support.AbstractRibbonCommand.getFallback(AbstractRibbonCommand.java:42) ~[spring-cloud-netflix-core-1.2.5.RELEASE.jar:1.2.5.RELEASE]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_66]
at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61) [tomcat-embed-core-8.5.11.jar:8.5.11]
at java.lang.Thread.run(Thread.java:745) [na:1.8.0_66]
2017-03-07 19:57:41.428 WARN 26658 --- [nio-5555-exec-3] o.s.c.n.z.filters.post.SendErrorFilter : Error during filtering
com.netflix.zuul.exception.ZuulException: Forwarding error
at org.springframework.cloud.netflix.zuul.filters.route.RibbonRoutingFilter.handleException(RibbonRoutingFilter.java:170) ~[spring-cloud-netflix-core-1.2.5.RELEASE.jar:1.2.5.RELEASE]
at org.springframework.cloud.netflix.zuul.filters.route.RibbonRoutingFilter.forward(RibbonRoutingFilter.java:145) ~[spring-cloud-netflix-core-1.2.5.RELEASE.jar:1.2.5.RELEASE]
at org.springframework.cloud.netflix.zuul.filters.route.RibbonRoutingFilter.run(RibbonRoutingFilter.java:88) ~[spring-cloud-netflix-core-1.2.5.RELEASE.jar:1.2.5.RELEASE]
The following is the Zuul configuration, used with @EnableZuulProxy and @EnableEurekaClient:
server:
  port: 5555
spring:
  application:
    name: gateway-server
  cloud:
    config:
      discovery:
        enabled: true
        service-id: CONFIGSERVER
      fail-fast: true
      retry:
        multiplier: 1.1
        initial-interval: 1000
        max-attempts: 6
        max-interval: 2000
hystrix:
  command:
    default:
      execution:
        isolation:
          thread:
            timeoutInMilliseconds: 100000
        timeout:
          enabled: false
ribbon:
  ReadTimeout: 5000
  ConnectTimeout: 3000
  maxAutoRetries: 1
  MaxAutoRetriesNextServer: 2
  OkToRetryOnAllOperations: true
logging:
  level:
    ROOT: DEBUG
zuul:
  routes:
    security-service:
      retryable: true
The 2 instances of the service are running with unique instance IDs:
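import org.springframework.boot.SpringApplication;
import org.springframework.boot.actuate.health.Health;
import org.springframework.boot.actuate.health.HealthIndicator;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.cloud.netflix.eureka.EnableEurekaClient;
import org.springframework.cloud.netflix.hystrix.EnableHystrix;

@EnableEurekaClient
@EnableHystrix
@SpringBootApplication
public class SecurityServer implements HealthIndicator {

    public static void main(String[] args) {
        SpringApplication.run(SecurityServer.class, args);
    }

    // Always reports the instance as healthy.
    @Override
    public Health health() {
        return Health.up().withDetail("STATUS", "SUCCESS").build();
    }
}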
instanceId: ${spring.cloud.client.hostname}:${spring.application.name}:${spring.application.instance_id:${random.uuid}}
Can you help me with the Zuul and instance configuration, so that a request is automatically forwarded to the other available instances when an instance goes down?
Answer 1:
After searching further and looking through the spring-cloud-netflix issue tracker, I found a wonderful discussion between william-tran and ryanjbaxter on best practices. Thanks to both of you.
https://github.com/spring-cloud/spring-cloud-netflix/issues/1290#issuecomment-242204614
https://github.com/spring-cloud/spring-cloud-netflix/issues/1295
In summary, Camden doesn't use the Ribbon HTTP client (deprecated), so none of the ribbon.* properties will help you control the retry logic. Camden uses the Apache HTTP client instead.
So one solution is to switch back to the Ribbon HTTP client in Camden using the configuration below:
ribbon.restclient.enabled=true
or
move to Camden.BUILD-SNAPSHOT or Dalston.BUILD-SNAPSHOT, which use spring-retry (https://github.com/spring-projects/spring-retry).
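For readers unsure what the retry settings in the question are meant to achieve once retry is actually active, here is a minimal sketch of the intended behaviour. It is a simplified model, not Ribbon's or spring-retry's real implementation: the class RetrySketch, the method route, and the second address 10.99.4.151:64382 are hypothetical names made up for illustration. It assumes MaxAutoRetries counts extra attempts on the same server and MaxAutoRetriesNextServer counts how many further servers are tried after the first one.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.function.Predicate;

public class RetrySketch {

    /**
     * Simplified model of "retry same server, then move to the next server".
     * Each server gets 1 + maxAutoRetries attempts, and at most
     * 1 + maxAutoRetriesNextServer servers are tried in list order.
     * Returns the server that answered, or empty if every attempt failed.
     */
    static Optional<String> route(List<String> servers,
                                  Predicate<String> isReachable,
                                  int maxAutoRetries,
                                  int maxAutoRetriesNextServer) {
        if (servers.isEmpty()) {
            return Optional.empty();
        }
        for (int serverIdx = 0; serverIdx <= maxAutoRetriesNextServer; serverIdx++) {
            // Wrap around the list, the way a round-robin rule would.
            String server = servers.get(serverIdx % servers.size());
            for (int attempt = 0; attempt <= maxAutoRetries; attempt++) {
                if (isReachable.test(server)) {
                    return Optional.of(server);
                }
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // First address is the dead instance from the log; the second is hypothetical.
        List<String> servers = Arrays.asList("10.99.4.151:64381", "10.99.4.151:64382");
        Predicate<String> isReachable = s -> !s.equals("10.99.4.151:64381");
        // Uses the values from the question: MaxAutoRetries=1, MaxAutoRetriesNextServer=2.
        System.out.println(route(servers, isReachable, 1, 2));
    }
}
```

With the values from the question, the dead instance would be attempted twice before the request moves on to the second instance. Without an active retry mechanism, the first connection failure goes straight to the "Forwarding error" shown in the log.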
Answer 2:
Ryan Baxter has written a very good article on the fixes for this issue in the Brixton and Camden releases.
Source: https://stackoverflow.com/questions/42651456/spring-cloud-zuul-retry-when-instance-is-down-and-forward-to-other-available-ins