How should I copy a keyspace within a cluster?

前端未结


I have a keyspace populated with data that was expensive to generate. I want two copies of this data within my cluster. I would like to end up with two keyspaces: let's call


1 answer

渐次进展

2021-01-05 01:59
"Is there a better way?"

All Cassandra data is stored in the data/ folder (check the data_file_directories setting in cassandra.yaml). You may also want to check the saved_caches_directory and commitlog_directory settings.
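To find those folders on a node, you can read the three settings out of cassandra.yaml. Below is a minimal Python sketch; `read_storage_dirs` is a hypothetical helper, and the parsing is a naive line-based scan that assumes the layout of the stock cassandra.yaml (a real YAML parser would be more robust).

```python
def read_storage_dirs(yaml_text):
    """Extract the storage-related settings from the text of cassandra.yaml.

    Naive line-based parsing (an assumption, not a real YAML parser): it
    expects data_file_directories as a "- path" list and the two other
    settings as simple "key: value" lines.
    """
    dirs = {"data_file_directories": []}
    in_data_list = False
    for raw in yaml_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        if in_data_list and line.startswith("- "):
            dirs["data_file_directories"].append(line[2:].strip())
            continue
        in_data_list = False
        key, sep, value = line.partition(":")
        if not sep:
            continue
        if key == "data_file_directories":
            in_data_list = True
        elif key in ("commitlog_directory", "saved_caches_directory"):
            dirs[key] = value.strip()
    return dirs
```

Typical use would be `read_storage_dirs(open("/etc/cassandra/cassandra.yaml").read())`, where the path is only an example and depends on how Cassandra was installed.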

Inside the data folder, you'll have:
1. One folder per keyspace
2. One folder for the system keyspace
3. Some folders for authentication, etc.

Inside each keyspace folder, you'll have:
1. *-Data.db files, which contain your real data
2. *-Filter.db files
3. *-Index.db files for the index
4. ...

To replicate data, you do a plain copy of those folders.
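The plain copy described above can be sketched in Python as follows. The function names and the example paths are assumptions made for illustration; the copy itself is just `shutil.copytree` on one keyspace folder.

```python
import shutil
from pathlib import Path


def list_sstable_files(keyspace_dir):
    """Return every *.db file (Data, Filter, Index, ...) under a keyspace folder."""
    return sorted(p for p in Path(keyspace_dir).rglob("*.db") if p.is_file())


def copy_keyspace_dir(data_dir, keyspace, dest_root):
    """Plain recursive copy of <data_dir>/<keyspace> to <dest_root>/<keyspace>."""
    src = Path(data_dir) / keyspace
    dest = Path(dest_root) / keyspace
    # copytree refuses to overwrite: it raises FileExistsError if dest exists.
    shutil.copytree(src, dest)
    return dest
```

For example, `copy_keyspace_dir("/var/lib/cassandra/data", "ks1", "/backups/2021-01-05")` would copy a keyspace named `ks1`; both the data path and the keyspace name are placeholders.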

In our team, the ops use a crontab to schedule regular backup of Cassandra data this way.

Note: sometimes you may miss live data that is still in memory (in a memtable) and has not been flushed to disk yet. You can trigger a full compaction before backing up the data files, but a full compaction may hurt your performance, so be careful. Running nodetool flush first writes the memtables to disk without compacting.
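To avoid missing unflushed data, a backup script can flush the keyspace right before copying the files. This is a minimal sketch that shells out to `nodetool flush`; it assumes `nodetool` is on the PATH of the node running the script, and the helper names are hypothetical.

```python
import subprocess


def flush_command(keyspace):
    """Build the nodetool invocation that flushes a keyspace's memtables to disk."""
    return ["nodetool", "flush", keyspace]


def flush_keyspace(keyspace):
    """Run the flush on the local node; raises CalledProcessError if it fails."""
    subprocess.run(flush_command(keyspace), check=True)
```

The flush only affects the node it runs on, so in a multi-node cluster it has to be run on every node before that node's files are copied.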

Better answer: use the provided tool to take a snapshot of your DB:

http://www.datastax.com/docs/1.0/operations/backup_restore
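A snapshot can be scripted in the same way. The sketch below builds a `nodetool snapshot` call with a named tag and then looks for the resulting snapshot folders. The helper names are hypothetical, and the folder layout is an assumption: older releases put snapshots under the keyspace folder, newer ones under each table folder, so the glob accepts both. Check `nodetool help snapshot` for the exact options in your version.

```python
import subprocess
from pathlib import Path


def snapshot_command(keyspace, tag):
    """Build the nodetool invocation that snapshots one keyspace under a named tag."""
    return ["nodetool", "snapshot", "-t", tag, keyspace]


def take_snapshot(keyspace, tag):
    """Run the snapshot on the local node; raises CalledProcessError if it fails."""
    subprocess.run(snapshot_command(keyspace, tag), check=True)


def find_snapshot_dirs(data_dir, keyspace, tag):
    """Find snapshots/<tag> folders at any depth below <data_dir>/<keyspace>."""
    keyspace_dir = Path(data_dir) / keyspace
    return sorted(p for p in keyspace_dir.glob(f"**/snapshots/{tag}") if p.is_dir())
```

A snapshot is made of hard links to the existing SSTable files, so taking one is cheap; the folders returned by `find_snapshot_dirs` are what you would then copy elsewhere.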