Question
Anyone have any insight into how GitHub deals with the potential failure or temporary unavailability of a Redis server when using Resque?
Others seem to have put together semi-complicated solutions using ZooKeeper as a stopgap until Redis Cluster is available (see https://github.com/ryanlecompte/redis_failover and the question "Solutions for resque failover redis"). Still others use a "poor man's failover" that promotes the slave to master at the first sign of connectivity issues, with no coordination between Redis clients. That seems problematic when the master is only temporarily unavailable.
The question: Has Defunkt ever talked about how GitHub handles Redis failure? Is there a best practice for failover that doesn't involve zookeeper?
The original post on Resque states that part of the rationale for selecting Redis was its master-slave capability, but the post doesn't describe how GitHub leverages this, since all workers need both read and write access to Redis (see https://github.com/blog/542-introducing-resque).
Answer 1:
The base Resque library does not handle failures. If a box dies immediately after popping a message off the queue, the message is gone forever. You'll have to write your own code to handle failures, which is quite tricky.
https://github.com/resque/resque/issues/93
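One common way to write that failure-handling code yourself is the "reliable queue" pattern: instead of popping a job outright, move it atomically onto a processing list and delete it only after the work succeeds. The following is a minimal sketch of that idea, not something Resque or GitHub is known to do. `FakeListStore`, `ReliableWorker`, and the key names are hypothetical, and the store is an in-memory stand-in so the example runs without a server. Against a real Redis, the move would be a single `RPOPLPUSH` command.

```ruby
# In-memory stand-in for Redis lists (hypothetical, for illustration only).
# Index 0 is the head of a list, the last element is the tail.
class FakeListStore
  def initialize
    @lists = Hash.new { |hash, key| hash[key] = [] }
  end

  # Push a value onto the head of a list (like LPUSH).
  def lpush(key, value)
    @lists[key].unshift(value)
  end

  # Pop the tail of source and push it onto the head of destination
  # (like RPOPLPUSH). Returns nil when source is empty.
  def rpoplpush(source, destination)
    value = @lists[source].pop
    @lists[destination].unshift(value) unless value.nil?
    value
  end

  # Remove every occurrence of value (like LREM key 0 value).
  def lrem(key, value)
    before = @lists[key].size
    @lists[key].delete(value)
    before - @lists[key].size
  end

  # Return a copy of the list contents for inspection.
  def list(key)
    @lists[key].dup
  end
end

# Hypothetical worker that never holds a job only in memory.
# Assumes job payloads are unique, since lrem removes all matching entries.
class ReliableWorker
  def initialize(store, queue: "queue:jobs", processing: "queue:processing")
    @store = store
    @queue = queue
    @processing = processing
  end

  # Move one job to the processing list, run the block, then acknowledge it.
  # If the block raises (or the box dies), the job stays on the processing list.
  def work_one
    job = @store.rpoplpush(@queue, @processing)
    return nil if job.nil?

    yield job
    @store.lrem(@processing, job)
    job
  end

  # Move jobs left behind by dead workers back onto the main queue.
  # Returns the number of jobs requeued.
  def requeue_orphans
    count = 0
    count += 1 while @store.rpoplpush(@processing, @queue)
    count
  end
end
```

A supervisor would call `requeue_orphans` only once it is confident the original worker is dead. Deciding that safely (heartbeats, timeouts, duplicate execution) is the tricky part, and this sketch does not address it, nor does it address failure of the Redis server itself.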
Source: https://stackoverflow.com/questions/10590038/githubs-redis-and-resque-failure-behavior