Question
I have a 3-node Cassandra cluster and a keyspace created with a replication_factor of 3.
I back up this keyspace with nodetool snapshot. As recommended by the Cassandra documentation, to make a global backup I start it with a cron job on each node (the 3 nodes are NTP-synchronized). I'm not using incremental snapshots; it's always a new global snapshot.
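The per-node cron job described above could look roughly like the sketch below. It is only an illustration: the keyspace name my_keyspace, the tag format, and the DRY_RUN switch are assumptions, not details from the original setup. By default it only prints the nodetool command it would run.

```shell
#!/bin/sh
# Sketch of a per-node snapshot job, meant to be started by cron on every node.
# KEYSPACE and the tag format are placeholders, not values from the post.
KEYSPACE="${KEYSPACE:-my_keyspace}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the command, 0 = really execute it

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Take a full snapshot of the keyspace under the given tag.
take_snapshot() {
    run nodetool snapshot -t "$1" "$KEYSPACE"
}

# Use the current date as the tag so each run produces a new snapshot.
take_snapshot "backup-$(date +%Y%m%d)"
```

Running it with DRY_RUN=0 on each node would execute the snapshot for real. Using the same tag on all three nodes makes it easier to find the matching snapshot directories later.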
Unfortunately, I'm having some trouble with the restore process.
First of all, I've set the replication factor to 3 (and use the QUORUM consistency level for READ and WRITE operations) to make sure my app keeps working even if 1 node is down.
My first scenario is not really a restore process: one node goes down because, let's say, someone or something shut down the VM that the node was running on. The 2 other nodes keep working and receiving read/write requests. 24 hours later, I manage to restart the VM of the first node, all services and files are still there, and I'm about to restart the node. Are there any actions that I should take before or after restarting it?
The second scenario is pretty much the same, but I was not able to recover the VM of the first node and I need to reinstall everything on it, including Cassandra. How should I use my backup to resync this node? Should I even use it, or is Cassandra able to resync everything without me having to restore anything? What exactly should I do in this case?
My last scenario is different. I've lost all my nodes and cannot recover anything. I have my global snapshot (3 snapshots, 1 for each node, taken at the same time). What is the process in this case?
I've read the Cassandra documentation on the restore process, and I prefer the simple copy-restore approach (in other words, I'd rather not use sstableloader). I have trouble understanding when I should use the refresh and/or repair commands in those scenarios.
Answer 1:
"I have trouble understanding when I should use the refresh and/or repair commands in those scenarios."
According to the documentation, you should run refresh when you restore data from a snapshot, that is, in the 2nd and 3rd scenarios.
I suppose repair is not a strictly required step in any of the three scenarios. Still, I would recommend running it, because it is an easy and useful way to get consistent data on freshly restored nodes.
Furthermore, running repair on a regular basis is a recommended part of Cassandra cluster maintenance.
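As a rough sketch of how the two commands fit together on a restored node: once the snapshot's SSTable files have been copied back into each table's data directory, refresh is run per table and repair is run afterwards for the keyspace. The keyspace name my_keyspace, the table names users and events, and the DRY_RUN switch are assumptions for illustration only. By default the script only prints the commands.

```shell
#!/bin/sh
# Sketch: load restored SSTables with refresh, then repair the keyspace.
# Assumes the snapshot files were already copied into the table directories.
# KEYSPACE and TABLES are hypothetical names, not taken from the post.
KEYSPACE="${KEYSPACE:-my_keyspace}"
TABLES="${TABLES:-users events}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = really execute them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

refresh_and_repair() {
    # refresh makes the node pick up newly placed SSTables without a restart;
    # it takes one keyspace and one table per call.
    for table in $TABLES; do
        run nodetool refresh "$KEYSPACE" "$table"
    done
    # repair then brings the node's replicas in line with the other nodes.
    run nodetool repair "$KEYSPACE"
}

refresh_and_repair
```

In the first scenario, where the node comes back with its data intact, nothing has to be copied, so only the repair line would apply.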
Source: https://stackoverflow.com/questions/40953885/handle-different-restore-scenarios-with-cassandra-2-2