What is the impact of increasing the number of column families in Cassandra on heap utilization?

-上瘾入骨i 2021-01-26 11:19

We are using Cassandra 1.1.

For optimization purposes, we decided to increase the number of column families in our keyspace.

Will it have any impact on heap utilization?

2 Answers
  • 2021-01-26 11:36

    Recent versions of Cassandra allocate a minimum of 1MB on the heap for each column family, so you can treat that as the lower bound for heap consumption. Bloom filters also take up heap space in a way that doesn't necessarily depend on how much you use the column family.

    Are you talking about going from 5 to 10 column families? Or from 10 to 1000? You can certainly run out of heap space with either 10 or 1000 column families; it depends a lot on the rate at which you're inserting data.
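To put the lower bound in numbers, here is a minimal sketch that assumes the 1 MB per column family figure quoted in this answer. The function name is hypothetical, and the result deliberately ignores Bloom filters and memtable contents, which come on top of this floor and depend on the data and the write rate.

```python
# Assumption: 1 MB minimum heap per column family, as quoted in the answer above.
MIN_HEAP_PER_CF_MB = 1


def min_heap_for_cfs_mb(num_column_families: int) -> int:
    """Return the fixed per-CF heap floor in MB (excludes Bloom filters and memtables)."""
    if num_column_families < 0:
        raise ValueError("num_column_families must be non-negative")
    return num_column_families * MIN_HEAP_PER_CF_MB


if __name__ == "__main__":
    # The column family counts mentioned in the answer.
    for count in (5, 10, 1000):
        print(f"{count} column families -> at least {min_heap_for_cfs_mb(count)} MB of heap")
```

Going from 5 to 10 column families barely moves this floor, while 1000 column families reserve roughly 1 GB before any data is written.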

  • 2021-01-26 11:44

    According to the Cassandra wiki (ref: MemtableThresholds), the heap consumed per node is estimated as: memtable_throughput_in_mb * 3 * number of hot CFs + 1G + internal caches

    So to answer the first question: Will it have any impact on heap utilization? Yes.

    Regarding q2, I strongly believe there is no possibility of OOM with the latest versions. Since you mentioned Cassandra 1.1: the per-CF setting memtable_throughput_in_mb has been replaced by a global memory setting, memtable_total_space_in_mb. This setting is equivalent to memtable_throughput_in_mb * number of hot CFs in the formula above. It ensures that JVM heap usage does not scale with the number of CFs and is always bounded by a global setting.
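The difference between the two models can be sketched with a small calculation. This is only an illustration of the formula quoted in this answer: the function names and the sample values (64 MB per-CF throughput, 1024 MB global memtable space) are hypothetical, and the second function takes the answer's substitution literally rather than describing how Cassandra actually accounts for memory.

```python
BASE_OVERHEAD_MB = 1024  # the "+ 1G" term from the quoted formula


def per_cf_heap_estimate_mb(memtable_throughput_in_mb: int,
                            hot_cfs: int,
                            internal_caches_mb: int = 0) -> int:
    """Quoted formula: memtable_throughput_in_mb * 3 * hot CFs + 1G + internal caches."""
    return memtable_throughput_in_mb * 3 * hot_cfs + BASE_OVERHEAD_MB + internal_caches_mb


def global_cap_heap_estimate_mb(memtable_total_space_in_mb: int,
                                internal_caches_mb: int = 0) -> int:
    """Same formula with (memtable_throughput_in_mb * hot CFs) replaced by the
    global memtable_total_space_in_mb, following the answer's reading."""
    return memtable_total_space_in_mb * 3 + BASE_OVERHEAD_MB + internal_caches_mb


if __name__ == "__main__":
    # Hypothetical values chosen only for illustration.
    for hot_cfs in (10, 100, 1000):
        old = per_cf_heap_estimate_mb(64, hot_cfs)
        new = global_cap_heap_estimate_mb(1024)
        print(f"{hot_cfs:>4} hot CFs: per-CF model {old} MB, global cap {new} MB")
```

Under the per-CF model the estimate grows linearly with the number of hot CFs, while under the global setting it stays constant regardless of how many CFs are added.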
