What algorithms are there for failover in a distributed system?

北荒 2021-01-30 01:10

I'm planning on making a distributed database system using a shared-nothing architecture and multiversion concurrency control. Redundancy will be achieved through asynchronous

5 answers
  •  小鲜肉 (original poster)
    2021-01-30 02:00

    You are asking an absolutely massive question, and a lot of what you want to know is still an area of active research.

    Some thoughts:

    • Distributed systems are difficult because there is no foolproof way to deal with failures: in an asynchronous system, you cannot tell whether a node is actually down or merely slow to respond because of network delay. This may sound trivial, but it really isn't.
    • Consensus can be achieved with the Paxos family of algorithms, versions of which are used at Google (Bigtable relies on the Paxos-based Chubby lock service) and in other places.
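
    To make the first point concrete, here is a minimal sketch of a timeout-based heartbeat failure detector. The class name, the node names, the 2-second timeout and the simulated clock are all assumptions made up for illustration, not something from the question. The point is that the detector can only suspect a node, never prove that it is down:

```python
import time
from typing import Callable, Dict, List


class HeartbeatFailureDetector:
    """Timeout-based detector: it can only *suspect* a node, never prove it is down."""

    def __init__(self, timeout_s: float, clock: Callable[[], float] = time.monotonic) -> None:
        self.timeout_s = timeout_s
        self.clock = clock  # injectable so the example can be driven deterministically
        self.last_seen: Dict[str, float] = {}

    def heartbeat(self, node_id: str) -> None:
        """Record that a heartbeat from node_id arrived just now."""
        self.last_seen[node_id] = self.clock()

    def suspects(self) -> List[str]:
        """Return the nodes whose last heartbeat is older than the timeout."""
        now = self.clock()
        return sorted(n for n, t in self.last_seen.items() if now - t > self.timeout_s)


# Simulated clock (hypothetical, for illustration only).
fake_now = [0.0]
detector = HeartbeatFailureDetector(timeout_s=2.0, clock=lambda: fake_now[0])

detector.heartbeat("node-a")
detector.heartbeat("node-b")

fake_now[0] = 3.0                        # three seconds pass
detector.heartbeat("node-a")             # node-a reports in on time
suspected_at_3s = detector.suspects()    # node-b is suspected: crashed, or just delayed?

fake_now[0] = 3.5
detector.heartbeat("node-b")             # the late heartbeat finally arrives
suspected_at_3_5s = detector.suspects()  # the earlier suspicion was a false positive
```

    If failover had been triggered at the 3-second mark, the system would now have two nodes that both believe they are in charge. No choice of timeout removes that risk; a shorter one only trades slower detection for more false positives.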

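    As a rough illustration of the second point, below is a sketch of single-decree Paxos (agreeing on one value, for example which node is the primary). It is a simplified, in-process toy under stated assumptions: there is no network, no message loss and no persistence, and the names `Acceptor` and `propose` are invented here. It is not how any particular production system implements it.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Acceptor:
    promised: int = -1                    # highest ballot number promised so far
    accepted_ballot: int = -1             # ballot of the accepted value, if any
    accepted_value: Optional[str] = None

    def prepare(self, ballot: int) -> Optional[Tuple[int, Optional[str]]]:
        """Phase 1: promise to ignore lower ballots and report any accepted value."""
        if ballot > self.promised:
            self.promised = ballot
            return (self.accepted_ballot, self.accepted_value)
        return None  # rejected: a higher or equal ballot was already promised

    def accept(self, ballot: int, value: str) -> bool:
        """Phase 2: accept the value unless a higher ballot has been promised."""
        if ballot >= self.promised:
            self.promised = ballot
            self.accepted_ballot = ballot
            self.accepted_value = value
            return True
        return False


def propose(acceptors: List[Acceptor], ballot: int, value: str) -> Optional[str]:
    """Run one proposer round; return the chosen value, or None without a majority."""
    majority = len(acceptors) // 2 + 1

    # Phase 1: gather promises from the acceptors.
    promises = [p for p in (a.prepare(ballot) for a in acceptors) if p is not None]
    if len(promises) < majority:
        return None

    # If some acceptor already accepted a value, adopt the one with the highest ballot.
    _, prev_value = max(promises, key=lambda p: p[0])
    if prev_value is not None:
        value = prev_value

    # Phase 2: ask the acceptors to accept the (possibly adopted) value.
    acks = sum(a.accept(ballot, value) for a in acceptors)
    return value if acks >= majority else None


acceptors = [Acceptor() for _ in range(3)]
first = propose(acceptors, ballot=1, value="node-a is primary")
second = propose(acceptors, ballot=2, value="node-b is primary")  # must adopt the chosen value
stale = propose(acceptors, ballot=1, value="node-c is primary")   # outdated ballot is rejected
```

    The second proposer asks for a different primary but ends up confirming the value that was already chosen, which is the safety property that makes Paxos usable for failover decisions. A real deployment also has to handle lost messages, retries with higher ballots, and durable acceptor state.
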
    You'll want to delve into a distributed systems textbook (or several). I like Tanenbaum's Distributed Systems: Principles and Paradigms.
