Question
As you know, a Cassandra cluster uses replication to prevent data loss even if some nodes in the cluster go down. But suppose an admin accidentally drops a table holding a large amount of data, and that command has already been executed by all the replicas in the cluster. Does this mean the table is lost and cannot be restored? Is there any suggested way to cope with this kind of disaster with minimal server downtime?
Answer 1:
From the Cassandra docs:
auto_snapshot (Default: true ) Enable or disable whether a snapshot is taken of the data before keyspace truncation or dropping of tables. To prevent data loss, using the default setting is strongly advised. If you set to false, you will lose data on truncation or drop.
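As a rough illustration of where such a snapshot would end up, here is a minimal Python sketch that searches a node's data directory for snapshots left behind by a drop. It rests on assumptions that are not stated in the quote above: that snapshots live under `<data_dir>/<keyspace>/<table>[-<table id>]/snapshots/`, and that drop snapshots are given a name starting with `dropped-` (older versions may name them differently). The function name `find_drop_snapshots` and the paths in the example are hypothetical.

```python
from pathlib import Path


def find_drop_snapshots(data_dir, keyspace, table):
    """Return snapshot directories that look like auto-snapshots of a dropped table.

    Assumed layout: <data_dir>/<keyspace>/<table>[-<table id>]/snapshots/dropped-*
    """
    keyspace_dir = Path(data_dir) / keyspace
    if not keyspace_dir.is_dir():
        return []

    found = []
    for table_dir in keyspace_dir.iterdir():
        # Accept "<table>" (older layout) or "<table>-<table id>" (newer layout),
        # but not other tables that merely share the same name prefix.
        name = table_dir.name
        if name != table and not name.startswith(table + "-"):
            continue
        snapshots_dir = table_dir / "snapshots"
        if not snapshots_dir.is_dir():
            continue
        for snapshot in snapshots_dir.iterdir():
            # Assumption: drop snapshots are prefixed with "dropped-".
            if snapshot.is_dir() and snapshot.name.startswith("dropped-"):
                found.append(snapshot)
    return sorted(found)


if __name__ == "__main__":
    # Hypothetical data directory, keyspace, and table names.
    for path in find_drop_snapshots("/var/lib/cassandra/data", "my_keyspace", "my_table"):
        print(path)
```

This has to be run on every node, since each node only snapshots its own SSTables. Restoring would then involve recreating the table with the same schema and loading the snapshot files back in.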
Answer 2:
If the administrator has deleted the data and the deletion has been replicated to all the nodes, it is difficult to recover the data without a consistent backup.
However, because deletes in Cassandra are not executed instantly, you may still be able to recover the data. When you delete data, Cassandra replaces it with a tombstone. The tombstone can then be propagated to replicas that missed the initial remove request.
See http://wiki.apache.org/cassandra/DistributedDeletes
Columns marked with a tombstone exist for a configured time period (defined by the gc_grace_seconds value set on the column family), and then are permanently deleted by the compaction process after that time has expired. The default value is 10 days.
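To make the timing concrete, here is a small Python sketch of the arithmetic only. It assumes the default of 10 days (864000 seconds) quoted above; the function name `tombstone_purgeable_at` is hypothetical, and the result is the earliest moment a compaction may drop the tombstone, not the moment it actually will.

```python
from datetime import datetime, timedelta, timezone

# Default quoted above: 10 days, expressed in seconds.
DEFAULT_GC_GRACE_SECONDS = 10 * 24 * 60 * 60


def tombstone_purgeable_at(deleted_at, gc_grace_seconds=DEFAULT_GC_GRACE_SECONDS):
    """Earliest time at which compaction may permanently remove a tombstone."""
    return deleted_at + timedelta(seconds=gc_grace_seconds)


if __name__ == "__main__":
    deleted_at = datetime(2014, 7, 1, 12, 0, tzinfo=timezone.utc)
    print(tombstone_purgeable_at(deleted_at).isoformat())
```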
Following the explanation in About Deletes, if you shut down some of the nodes, wait until compaction on the remaining nodes succeeds and the data is completely deleted from their SSTables, and then turn the stopped nodes back on, the data could appear again. But this will only happen if you don't run periodic repair operations on the node.
I have never tried this; it is only an idea that came to me while reading the Cassandra documentation.
Source: https://stackoverflow.com/questions/24503614/restore-cassandra-cluster-data-when-acccidentally-drop-table