Cassandra cluster - data density (data size per node) - looking for feedback and advice

攒了一身酷 2021-02-08 03:57

I am considering the design of a Cassandra cluster.

The use case would be storing large rows of tiny samples for time series data (using KairosDB), data will be almost i

3 answers

猫巷女王i (original poster)

2021-02-08 04:34

I would recommend thinking about your application's data model and how to partition your data. For time series data it probably makes sense to use a composite key [1], which consists of a partition key plus one or more clustering columns. Partitions are distributed across servers according to the hash of the partition key (the hash function depends on the Cassandra partitioner you configure, see cassandra.yaml).
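To make the hashing idea concrete, here is a deliberately simplified sketch of how a partition key could be mapped to a node. It is not Cassandra's actual implementation: MD5 is used only as a stand-in hash, the tokens are evenly spaced, and the node names and the `owner` helper are hypothetical.

```python
import bisect
import hashlib

RING_SIZE = 2 ** 128  # MD5 produces 128-bit values


def token(partition_key: str) -> int:
    """Stand-in token function: hash the partition key to an integer."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big")


def build_ring(nodes):
    """Assign each node one evenly spaced token (an assumption for this sketch)."""
    step = RING_SIZE // len(nodes)
    return [(i * step, node) for i, node in enumerate(nodes)]


def owner(ring, partition_key: str) -> str:
    """Return the first node whose token is >= the key's token, wrapping around."""
    tokens = [t for t, _ in ring]
    idx = bisect.bisect_left(tokens, token(partition_key)) % len(ring)
    return ring[idx][1]


nodes = ["node-a", "node-b", "node-c"]  # hypothetical node names
ring = build_ring(nodes)
```

The point of the sketch is only that the owner of a row is decided by the partition key alone, so every row sharing a partition key lands on the same node.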

For example, you could partition your data by the device that generates it (Pattern 1 in [2]) or by a period of time, e.g. per day (Pattern 2 in [2]).
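As a rough illustration of these patterns, the sketch below derives a composite key that uses both the device and a per-day bucket as the partition key, with the timestamp as the clustering column. The function names and the choice to bucket in UTC are assumptions made for this example and do not come from [2]; the timestamps are expected to be timezone-aware.

```python
from datetime import datetime, timezone


def day_bucket(ts: datetime) -> str:
    """Return the UTC calendar day of a timezone-aware timestamp."""
    return ts.astimezone(timezone.utc).strftime("%Y-%m-%d")


def composite_key(device_id: str, ts: datetime):
    """Split a sample's identity into (partition key, clustering column).

    Partition key: (device_id, day), so one partition holds one device-day.
    Clustering column: the timestamp, which orders samples inside the partition.
    """
    partition_key = (device_id, day_bucket(ts))
    return partition_key, ts
```

With this layout, two samples from the same device on the same day share a partition, while a sample from the next day starts a new one, which keeps any single partition bounded in size.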

You should also be aware that the maximum number of cells (values) per partition is limited to 2 billion [3], so partitioning is highly recommended. Don't store your entire time series in a single partition on a single Cassandra node.
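A quick back-of-the-envelope check shows how a bucket size relates to that limit. This is only a sketch: the 1 kHz sample rate and the one-value-per-sample layout are assumptions for illustration, and the helper names are hypothetical.

```python
MAX_CELLS_PER_PARTITION = 2_000_000_000  # limit quoted from [3]


def cells_per_partition(samples_per_second: float,
                        bucket_seconds: int,
                        columns_per_sample: int = 1) -> int:
    """Estimate how many cells one partition accumulates over one time bucket."""
    return int(samples_per_second * bucket_seconds * columns_per_sample)


def fits_in_partition(samples_per_second: float,
                      bucket_seconds: int,
                      columns_per_sample: int = 1) -> bool:
    """True if the estimated cell count stays within the 2 billion limit."""
    cells = cells_per_partition(samples_per_second, bucket_seconds, columns_per_sample)
    return cells <= MAX_CELLS_PER_PARTITION


ONE_DAY = 86_400           # seconds
ONE_YEAR = 365 * ONE_DAY   # seconds

# Assumed example: one device sampled at 1 kHz.
daily_cells = cells_per_partition(1000, ONE_DAY)    # 86,400,000 cells
yearly_cells = cells_per_partition(1000, ONE_YEAR)  # 31,536,000,000 cells
```

Under these assumptions a per-day bucket stays well below the limit, whereas a full year of the same device in one partition would exceed it many times over.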

[1] http://www.planetcassandra.org/blog/composite-keys-in-apache-cassandra/

[2] https://academy.datastax.com/demos/getting-started-time-series-data-modeling

[3] http://wiki.apache.org/cassandra/CassandraLimitations
