Storing a list of values in Cassandra

后端未结


Version Dependent

Some of the answers to this question deal with older versions of Cassandra. The correct answer for this kind of problem depends on the version of

3 Answers

-上瘾入骨i

2021-01-13 08:58
This answer dates to before the release of Cassandra version 1.2, which provided substantially different functionality for handling lists. The answer might be inappropriate if you are using Cassandra 1.2+.

I would encode the list in the column keys, using composite columns with the real column name as the first dimension, i.e.:
```
row_key -> {
     [column_name; entry1] -> "",
     [column_name; entry2] -> "",
     ... 
}
```
Then, to read the list, you would need to do a get_slice from [column_name; ] to [column_name; ]. Note the empty second dimension in both bounds.

The great thing about this is that it actually implements a set quite nicely: the list cannot contain the same thing twice. I think this works in your use case. The list would also be maintained in sorted order.
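
To make the set-like and sorted behaviour concrete, here is a minimal in-memory sketch. It does not talk to Cassandra; the `Row` class and its `insert` and `get_slice` methods are hypothetical stand-ins that only model a row whose composite column keys are kept in sorted order.

```python
import bisect


class Row:
    """Hypothetical in-memory model of one row with sorted composite column keys."""

    def __init__(self):
        self._keys = []  # sorted list of (column_name, entry) tuples

    def insert(self, column_name, entry):
        key = (column_name, entry)
        i = bisect.bisect_left(self._keys, key)
        # Writing a key that already exists just overwrites that column,
        # so the same entry never appears twice.
        if i == len(self._keys) or self._keys[i] != key:
            self._keys.insert(i, key)

    def get_slice(self, column_name):
        # (column_name,) sorts before every (column_name, entry), which plays
        # the role of the empty second dimension in the slice bounds.
        start = bisect.bisect_left(self._keys, (column_name,))
        entries = []
        for name, entry in self._keys[start:]:
            if name != column_name:
                break
            entries.append(entry)
        return entries


row = Row()
for skill in ["java", "cobol", "java"]:
    row.insert("skills", skill)
row.insert("hobbies", "chess")

print(row.get_slice("skills"))  # ['cobol', 'java']
```

The duplicate "java" disappears and the entries come back sorted, which is the behaviour described above.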
庸人自扰

2021-01-13 09:00
In older versions of Cassandra, you had to serialize the list yourself and store it in a column, or perhaps use a super column.

Since version 1.2 of Cassandra, CQL3 has collection types for columns, so you can give list<text> as the type of a column in your schema. For example:
```
 CREATE TABLE Person (
    name text,
    skills list<text>,
    PRIMARY KEY (name)
 );
```
Or you could use set<text> if you want to automatically eliminate duplicates.
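
The difference between the two collection types can be sketched in plain Python. This is only a model of the semantics, not driver code; the helper names `list_append` and `set_add` are made up for illustration, and the CQL in the comments assumes the Person table above.

```python
def list_append(skills, value):
    # Models: UPDATE Person SET skills = skills + ['<value>'] WHERE name = '...';
    # A list<text> column keeps duplicates and insertion order.
    return skills + [value]


def set_add(skills, value):
    # Models: UPDATE Person SET skills = skills + {'<value>'} WHERE name = '...';
    # A set<text> column silently drops duplicates.
    return skills | {value}


as_list = []
as_set = set()
for skill in ["java", "cobol", "java"]:
    as_list = list_append(as_list, skill)
    as_set = set_add(as_set, skill)

print(as_list)         # ['java', 'cobol', 'java']
print(sorted(as_set))  # ['cobol', 'java']
```

The `sorted` call stands in for the fact that, as far as I know, Cassandra returns set elements in sorted order rather than insertion order.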
予麋鹿

2021-01-13 09:04
This answer dates to before the release of Cassandra version 1.2, which provided substantially different functionality for handling lists. The answer might be inappropriate if you are using Cassandra 1.2+.

As mentioned on the mailing list, my preference, which has worked very well for me, is to store a single column "skills" whose value is a serialized JSON string.

It really comes down to the usage patterns you have for "skills".
- If "skills" are just for CRUD on a per-user basis, this is fine.
- If you want to be able to search for all users that have a skill of "cobol", then I would still recommend this approach, plus another row keyed skill:cobol that has one column per user UUID, with a value of a timestamp or something similar.
- I'm sure that with Pig/Hadoop integration on your Cassandra nodes, you could also quite happily query all of the users that have x, y and z to generate new data to support additional use cases.
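
A rough sketch of the first two patterns, using plain dictionaries in place of column families. Nothing here is Cassandra API; `users`, `skill_index`, `save_skills`, `load_skills` and `users_with_skill` are hypothetical names chosen for illustration.

```python
import json
import time
import uuid

users = {}        # user UUID -> {"skills": "<serialized JSON string>"}
skill_index = {}  # "skill:<name>" -> {user UUID: timestamp}


def load_skills(user_id):
    # Deserialize the single "skills" column back into a Python list.
    return json.loads(users[user_id]["skills"])


def save_skills(user_id, skills):
    # Drop index entries for skills the user no longer has.
    old = load_skills(user_id) if user_id in users else []
    for skill in set(old) - set(skills):
        skill_index.get("skill:" + skill, {}).pop(user_id, None)
    # Store the whole list as one serialized JSON string.
    users[user_id] = {"skills": json.dumps(skills)}
    # Maintain the inverted rows: one row per skill, one column per user.
    now = time.time()
    for skill in skills:
        skill_index.setdefault("skill:" + skill, {})[user_id] = now


def users_with_skill(skill):
    return set(skill_index.get("skill:" + skill, {}))


alice, bob = uuid.uuid4(), uuid.uuid4()
save_skills(alice, ["cobol", "java"])
save_skills(bob, ["cobol"])
save_skills(alice, ["java"])  # alice drops cobol

print(load_skills(alice))                 # ['java']
print(users_with_skill("cobol") == {bob})  # True
```

The cost of this layout is that every write to "skills" has to keep the skill:* rows in step, which the application must do itself.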