How to know affected rows in Cassandra(CQL)?

你离开我真会死。 提交于 2019-12-08 17:30:25

问题


There doesn't seem to be any direct way to know affected rows in cassandra for update, and delete statements.

For example if I have a query like this:

DELETE FROM xyztable WHERE PKEY IN (1,2,3,4,5,6);

Now, of course, since I've passed 6 keys, it is obvious that 6 rows will be affected.

But, like in RDBMS world, is there any way to know affected rows in update/delete statements in datastax-driver?

I've read cassandra gives no feedback on write operations here.

Except that I could not see any other discussion on this topic through google.

If that's not possible, can I be sure that with the type of query given above, it will either delete all or fail to delete all?


回答1:


In the eventually consistent world you can look at these operations as if it was saving a delete request, and depending on the requested consistency level, waiting for a confirmation from several nodes that this request has been accepted. Then the request is delivered to the other nodes asynchronously. Since there is no dependency on anything like foreign keys, then nothing should stop data from being deleted if the request was successfully accepted by the cluster.

However, there are a lot of ifs. For example, deleting data with a consistency level one, successfully accepted by one node, followed by an immediate node hard failure may result in the loss of that delete if it was not replicated before the failure.

Another example - during the deletion, one node was down, and stayed down for a significant amount of time, more than the gc_grace_period, i.e., more than it is required for the tombstones to be removed with deleted data. Then if this node is recovered, then all suddenly all data that has been deleted from the rest of the cluster, but not from this node, will be brought back to the cluster.

So in order to avoid these situations, and consider operations successful and final, a cassandra admin needs to implement some measures, including regular repair jobs (to make sure all nodes are up to date). Also applications need to decide what is better - faster performance with consistency level one at the expense of possible data loss, vs lower performance with higher consistency levels but with less possibility of data loss.




回答2:


There is no way to do this in Cassandra because the model for writes, deletes, and updates in Cassandra is basically the same. In all of those cases a cell is added to the table which has either the new information or information about the delete. This is done without any inspection of the current DB state.

Without checking the rest of the replicas and doing a full merge on the row there is no way to tell if any operation will actually effect the current read state of the database.

This leads to the oft cited anti-pattern of "Reading before a write." In Cassandra you are meant to write as fast as possible and if you need to have history, use a datastructure which preservations a log of modifications rather than just current state.

There is one option for doing queries like this, using the CAS syntax of IF value THEN do other thing but this is a very expensive operation compared normal write and should be used sparingly.



来源:https://stackoverflow.com/questions/28611459/how-to-know-affected-rows-in-cassandracql

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!