Question
We plan to add a column of type list to an existing Cassandra table whose data files total about 350 GB. We can temporarily halt all reads and writes for a few minutes while applying the schema change.
Our understanding is that Cassandra does not lock a table when applying schema changes, but to be sure, our DBA wants to run an experiment on a table with about 25 GB of data files. However, it would take 3-4 weeks to grow a table to that size on the small server where our non-production Cassandra instance runs (adding more concurrent inserts starts to cause timeouts).
Does anyone know whether adding a column to an existing Cassandra table returns promptly regardless of the size of the underlying data files?
Thanks
Answer 1:
Adding a column in Cassandra only adds the column's metadata to the internal table that holds schema information. Existing data is not modified by this change, so its cost does not depend on the size of the data files. Cassandra simply returns null for a column when there is no data for it on disk (this applies to any column, not only the newly added one); the null is substituted when the data is returned to the caller, not written into the files.
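To make the idea concrete, here is a minimal toy model in Python. It is only an illustration of the principle, not Cassandra's actual storage engine; the names `schema`, `rows_on_disk`, `add_column` and `read_row` are hypothetical.

```python
# Toy model only: this is NOT how Cassandra stores data internally.
schema = ["id", "name"]  # column names known to the table (the "metadata")
rows_on_disk = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]
snapshot = [dict(r) for r in rows_on_disk]  # copy used to show nothing is rewritten


def add_column(schema, column):
    """Schema change: touches only the metadata list, never the stored rows."""
    schema.append(column)


def read_row(schema, stored):
    """Read path: a column missing from the stored row comes back as None."""
    return {col: stored.get(col) for col in schema}


add_column(schema, "tags")
result = [read_row(schema, r) for r in rows_on_disk]
unchanged = rows_on_disk == snapshot  # stored rows were not modified
```

In this sketch `add_column` does the same amount of work whether there are two stored rows or two billion, which is the point made above.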
Similarly, dropping a column doesn't modify the existing data. Instead, a new entry is added to the system_schema.dropped_columns table, and the corresponding data is filtered out after it is read from disk.
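The drop case can be sketched the same way, again as a toy Python model rather than real Cassandra internals. The set `dropped_columns` merely stands in for the system_schema.dropped_columns table, and `drop_column` and `read_row` are hypothetical helpers.

```python
# Toy model only: this is NOT how Cassandra stores data internally.
schema = ["id", "name", "tags"]
dropped_columns = set()  # stand-in for the system_schema.dropped_columns table
rows_on_disk = [{"id": 1, "name": "a", "tags": ["x", "y"]}]


def drop_column(schema, dropped, column):
    """Schema change: record the drop in metadata; stored rows are untouched."""
    schema.remove(column)
    dropped.add(column)


def read_row(schema, dropped, stored):
    """Read path: filter out values of dropped columns after loading the row."""
    visible = {k: v for k, v in stored.items() if k not in dropped}
    return {col: visible.get(col) for col in schema}


drop_column(schema, dropped_columns, "tags")
result = read_row(schema, dropped_columns, rows_on_disk[0])
still_on_disk = "tags" in rows_on_disk[0]  # old value is still physically stored
```

Here the dropped value remains in the stored row and is only hidden at read time, mirroring the filtering described above.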
Source: https://stackoverflow.com/questions/61469356/does-adding-a-column-to-a-cassandra-table-complete-instantly