问题
I have been recently told that, cassandra truncate is not performant and it is anti pattern. But, I do not know why?
So, I have 2 questions:
Is it more performant to have upsert of all records then doing truncate?
Does truncate operation creates tombstones?
Cassandra Version: 3.x
回答1:
From the cassandra docs:
Note: TRUNCATE sends a JMX command to all nodes, telling them to delete SSTables that hold the data from the specified table. If any of these nodes is down or doesn't respond, the command fails and outputs a message like the following
So, running truncate will issue a deletion of all sstables belonging to your cassandra table, which will be quite fast but must be acknowledged by all nodes. Depending on your cassandra.yml this will snapshot your data before:
auto_snapshot (Default: true) Enable or disable whether a snapshot is taken of the data before keyspace truncation or dropping of tables. To prevent data loss, using the default setting is strongly advised. If you set to false, you will lose data on truncation or drop.
When creating or modifying tables, you enable or disable the key cache (partition key cache) or row cache for that table by setting the caching parameter. Other row and key cache tuning and configuration options are set at the global (node) level. Cassandra uses these settings to automatically distribute memory for each table on the node based on the overall workload and specific table usage. You can also configure the save periods for these caches globally.
To your question:
- upserts will be much slower (when there is significant data in your table)
- truncate does not write tombstones at all (instead it will delete all on all nodes for your truncated table sstables immediately)
来源:https://stackoverflow.com/questions/54071367/cassandra-truncate-performance