cassandra node limitations

予麋鹿 2021-02-08 02:58

I am trying to find out whether Cassandra has any limitations on node hardware specs, for example what the maximum storage per node could be, if such a limit exists.

I intend to use a couple of nodes with around 48 TB of storage per node.

5 Answers
  • 2021-02-08 03:10

    DataStax, the principal vendor, recommends 3 to 5 TB per node.

    See here:

    https://docs.datastax.com/en/cassandra/1.2/cassandra/architecture/architecturePlanningHardware_c.html

  • 2021-02-08 03:12

    Cassandra distributes its data by row, so the only hard limitation is that a row must be able to fit on a single node.

    So the short answer is no.

    The longer answer is that you'll want to set up separate storage areas for your permanent data and your commit logs (see the sketch at the end of this answer).

    One other thing to keep in mind is that you'll still run into seek-speed issues. One of the nice things about Cassandra is that you don't need a single node with that much data (and in fact it's probably not advisable: your storage will outpace your processing power). If you use smaller nodes (in terms of hard drive space), your storage and processing capabilities will scale together.
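
    In case it's useful, here is a minimal sketch of that data/commit-log separation, assuming a standard install with conf/cassandra.yaml and a cassandra service user; the mount points are hypothetical:

        # Hypothetical mount points: one disk for data files, a separate disk for the commit log.
        sudo mkdir -p /mnt/data1/cassandra/data /mnt/commitlog/cassandra/commitlog
        sudo chown -R cassandra:cassandra /mnt/data1/cassandra /mnt/commitlog/cassandra

        # Then point the two standard settings in conf/cassandra.yaml at the two disks:
        #   data_file_directories:
        #       - /mnt/data1/cassandra/data
        #   commitlog_directory: /mnt/commitlog/cassandra/commitlog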

  • 2021-02-08 03:20

    There are some notes here about large data set considerations.

    48 TB of data per node is probably far too much. It is much better to have more nodes, each holding a smaller amount of data. You periodically need to run nodetool repair, which involves reading all of the data on the machine (see the example at the end of this answer). If you are storing many terabytes of data on a machine, this will be very painful.

    I would limit each node to around 1 TB of data.
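
    As a reference point, this is roughly what that routine repair looks like; the keyspace name below is a placeholder:

        # Run on each node in turn; -pr limits the repair to the node's primary token ranges,
        # so the cluster as a whole isn't repaired multiple times over.
        nodetool repair -pr my_keyspace

        # With many terabytes per node this has to read back a large fraction of the data,
        # which is why smaller nodes keep repair times manageable.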

  • 2021-02-08 03:27

    You should also be careful about using large amounts of RAM with Cassandra. RAM is great for caching the data in SSTables, but giving the JVM too much heap space is counter-productive: don't give the JVM much more than about 12 GB of heap, otherwise garbage collection pauses will take too long and hurt performance. This is another reason why having more, smaller nodes is better in Cassandra.
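
    For what it's worth, a sketch of how that cap is usually applied, assuming the heap is sized in conf/cassandra-env.sh (itself a shell script); 8 GB is just an illustrative value under the limit above:

        # conf/cassandra-env.sh -- override the automatic heap sizing.
        # Keep the heap well below the point where GC pauses hurt (e.g. <= 12 GB)
        # and leave the rest of the RAM to the OS page cache for SSTable reads.
        MAX_HEAP_SIZE="8G"
        HEAP_NEWSIZE="800M"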

  • 2021-02-08 03:33

    See How much data per node in Cassandra cluster?

    which suggests that 1-10 TB per node is sensible, depending on your application. Cassandra will probably still work with 48 TB per node, but not optimally.

    Do you intend to use a replication factor of 1 or 2 (given the two nodes mentioned above)? See the sketch at the end of this answer for how that is set.

    Some operations (repair, compaction) may be extremely slow with that much data on a single node.
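
    If you do run two nodes and want a copy of the data on each, the replication factor is set per keyspace at creation time; a sketch, with a placeholder keyspace name:

        # Replication factor 2 = one replica of every row on each of the two nodes.
        cqlsh -e "CREATE KEYSPACE my_keyspace
                  WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 2};"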
