I was loading a large CSV into Cassandra using cassandra-loader.
The VM ran out of disk space during this process and crashed. I allocated more disk space to the VM.
Having some replication would certainly help you fix this without data loss, but it comes at a price.
Suppose that, despite all your efforts, you cannot recover the corrupted SSTable, so you decide to remove it from the file system in order to start Cassandra again. If you do not have replication, that data is lost. But if you have replication on the cluster, you may be able to fetch the data from other nodes. That is what nodetool repair does.
So nodetool repair does not repair a corrupted SSTable. Basically, nodetool repair compares tables from node to node to find missing or inconsistent data and then repairs it. You can find more information on how it works here.
However, nodetool repair is very expensive: it takes a long time and uses a lot of CPU, disk and network. There is this good post about repair benefits and drawbacks.
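As a rough sketch of how that cost might be limited when a repair is needed: the -pr (primary range) option of nodetool repair makes a node repair only the token ranges it is primary for, so running it on each node in turn avoids repairing the same range several times. The wrapper name repair_keyspace, the DRY_RUN variable and the keyspace name are assumptions made for illustration; they are not part of Cassandra.

```shell
#!/bin/sh
# Hypothetical helper: run a primary-range repair on a single keyspace.
# With DRY_RUN=1 it only prints the command so it can be reviewed first.
repair_keyspace() (
    keyspace=$1
    if [ -z "$keyspace" ]; then
        echo "usage: repair_keyspace KEYSPACE" >&2
        exit 2
    fi
    if [ "${DRY_RUN:-0}" = "1" ]; then
        # Show what would be executed without contacting the node.
        echo "nodetool repair -pr $keyspace"
    else
        nodetool repair -pr "$keyspace"
    fi
)
```

Calling repair_keyspace my_keyspace with DRY_RUN=1 set prints the command instead of running it (my_keyspace is a placeholder). On a single node without replication there are no other replicas to compare against, so a repair cannot bring lost data back.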
This is how I fixed the problem with commit logs. You should only do this if you don't care about preserving the state of your commit logs.
Try to restart Cassandra using
sudo systemctl restart cassandra
Then check
systemctl status cassandra
and see whether the status is 'exited', which means there is a problem. Check the Cassandra logs using
sudo less /var/log/cassandra/system.log
and look for an error such as org.apache.cassandra.db.commitlog.CommitLogReplayer$CommitLogReplayException: Could not read commit log descriptor in file /var/lib/cassandra/commitlog/CommitLog-6-1498210233635.log
Because I didn't care about preserving the state of Cassandra, I deleted all of the commit logs, and it now boots up fine:
sudo rm /var/lib/cassandra/commitlog/CommitLog*
sudo systemctl restart cassandra
systemctl status cassandra
(should confirm that it is now running)
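The log check in the steps above can be narrowed down with grep. This is a minimal sketch, assuming the error message has the same wording as the one quoted above; the function name unreadable_commitlogs is made up for illustration, and the default log path is the one used in this answer.

```shell
#!/bin/sh
# Hypothetical helper: list the commit log files that the replayer
# reported as unreadable. Prints one path per line, without duplicates.
unreadable_commitlogs() (
    logfile=${1:-/var/log/cassandra/system.log}
    grep 'Could not read commit log descriptor in file' "$logfile" |
        # Drop everything up to and including the message text.
        sed 's/.*Could not read commit log descriptor in file *//' |
        # Keep only the path, in case anything follows it on the line.
        awk '{print $1}' |
        sort -u
)
```

Reading /var/log/cassandra/system.log may require root, so the function might have to be run from a root shell. If it prints nothing, the startup failure probably has a different cause and deleting commit logs is unlikely to help.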
Since you don't care about the data, removing the files from \data\commitlogs should be the easiest solution.
Simply go into the commit log directory in Cassandra and delete the log files. It works fine.
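If deleting feels too final, a variant of the same idea is to move the commit log files aside so they can be inspected or restored later. The sketch below is only an illustration: the function name set_aside_commitlogs and the backup directory are assumptions, and Cassandra should be stopped before running it.

```shell
#!/bin/sh
# Hypothetical helper: move commit log segments into a backup directory
# instead of deleting them. Both directory arguments are required.
set_aside_commitlogs() (
    src=$1
    dest=$2
    if [ -z "$src" ] || [ -z "$dest" ]; then
        echo "usage: set_aside_commitlogs COMMITLOG_DIR BACKUP_DIR" >&2
        exit 2
    fi
    mkdir -p "$dest" || exit 1
    moved=0
    for f in "$src"/CommitLog*; do
        # Skip the unexpanded pattern when no file matches.
        [ -e "$f" ] || continue
        mv "$f" "$dest"/ || exit 1
        moved=$((moved + 1))
    done
    echo "moved $moved file(s) to $dest"
)
```

For a package install like the one described above, the first argument would be /var/lib/cassandra/commitlog and the second any directory with enough free space; the function would need to run with sufficient privileges to move those files. The writes held in those commit logs are still not replayed, so this has the same data-loss caveat as deleting them.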