Question
We have a table in Cassandra with over a million records, and we want to run a bulk update on one column (basically, set the column's value to null across the entire table).
Is there a way to do this, given that the query below won't work in CQL?
UPDATE TABLE_NAME SET COL1=NULL WHERE PRIMARY_KEY IN(SELECT PRIMARY_KEY FROM TABLE_NAME );
P.S. The column is not part of the primary key (neither a partition key nor a clustering key).
Answer 1:
There was a similar question recently, Deleting a column in cassandra for a large dataset. I also suggest reading the section Dropping a column in the ALTER TABLE documentation.
One solution in this case might be to drop the column and re-add it, since the documentation says:
"If you drop a column then re-add it, Cassandra does not restore the values written before the column was dropped. A subsequent SELECT on this column does not return the dropped data."
I would try this on a test system first and check whether the tombstones have been removed.
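As a rough sketch of the drop-and-re-add idea, the two schema changes could be issued from a client. The helper name `reset_column` and its parameters are made up for illustration, `session` is assumed to be a connected session from the DataStax Python driver (anything with an `execute` method works), and the column type has to be supplied by you because it is not given in the question.

```python
def reset_column(session, table, column, cql_type):
    """Drop `column` and add it back, so old values are no longer returned.

    Hypothetical helper: identifiers cannot be bound as query parameters,
    so they are interpolated and must come from trusted input.
    """
    session.execute(f"ALTER TABLE {table} DROP {column}")
    session.execute(f"ALTER TABLE {table} ADD {column} {cql_type}")


# Possible usage (assumes the cassandra-driver package and a local node):
# from cassandra.cluster import Cluster
# session = Cluster(["127.0.0.1"]).connect("my_keyspace")
# reset_column(session, "table_name", "col1", "text")
```

This changes the schema for every client at once, so it is one more reason to try it on a test system first.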
Answer 2:
There really isn't a way to do this through CQL short of iterating through each row and updating the value.
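For what it's worth, the row-by-row approach might look roughly like this with the DataStax Python driver. This is only a sketch: it assumes a single-column primary key, the helper name `null_out_column` is made up, and `session` is assumed to be a connected driver session. The driver pages through large result sets as you iterate, so the keys are not all loaded at once.

```python
def null_out_column(session, table, key_column, column):
    """Set `column` to null in every row, one UPDATE per primary key.

    Hypothetical helper; assumes `key_column` is the whole primary key.
    Table and column names are interpolated because identifiers cannot
    be bound, so they must come from trusted input.
    """
    update = session.prepare(
        f"UPDATE {table} SET {column} = null WHERE {key_column} = ?"
    )
    updated = 0
    # Iterating the result set fetches further pages as needed.
    for row in session.execute(f"SELECT {key_column} FROM {table}"):
        session.execute(update, (row[0],))
        updated += 1
    return updated


# Possible usage (assumes the cassandra-driver package and a local node):
# from cassandra.cluster import Cluster
# session = Cluster(["127.0.0.1"]).connect("my_keyspace")
# null_out_column(session, "table_name", "primary_key", "col1")
```

Keep in mind that writing null creates a tombstone for each cell, so over a million rows this adds a noticeable number of tombstones until compaction clears them.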
However, there might be a way to do this if you feel adventurous.
You could use COPY in cqlsh to export the table's data to a file. With a tool like sed you can modify this text file to change the column, and then import the same file back into Cassandra.
This solution is less than optimal and might not work for certain datasets, but it gets the job done.
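If sed feels awkward for CSV, the edit step could be sketched in Python instead. This assumes the file was exported with a header row (for example `COPY table_name TO 'export.csv' WITH HEADER = TRUE;` in cqlsh); the function name `blank_column` and the file names are made up for illustration.

```python
import csv


def blank_column(src_path, dst_path, column):
    """Copy a CSV export, replacing every value in `column` with an empty field.

    Hypothetical helper; assumes the first line of the file is a header.
    """
    with open(src_path, newline="") as src, open(dst_path, "w", newline="") as dst:
        reader = csv.reader(src)
        writer = csv.writer(dst)
        header = next(reader)
        idx = header.index(column)  # raises ValueError if the column is missing
        writer.writerow(header)
        for row in reader:
            row[idx] = ""
            writer.writerow(row)


# Possible usage:
# blank_column("export.csv", "export_blank.csv", "col1")
```

Since this route might not work for every dataset, it is worth importing the edited file into a test table and checking that the column really comes back as null.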
Personally, I would still prefer iterating over the rows to doing this.
Source: https://stackoverflow.com/questions/51635049/run-a-bulk-update-query-in-cassandra-on-1-column