Question
I have a 4-node Cassandra setup and decided to go with the following configuration, but people are saying it will behave the same as a 3-node setup. Could somebody please shed some light on why?
Nodes = 3, Replication Factor = 2, Write Consistency = 2, Read Consistency = 1
Nodes = 4, Replication Factor = 3, Write Consistency = 3, Read Consistency = 1
As I understand it, 4 nodes can tolerate two node failures, so it is beneficial to have RF = 3. But people are saying RF = 2 will be the same as RF = 3 in a 4-node setup. Could you please explain why?
Thanks, Harry
Answer 1:
Your question is a little unclear, but I will try to explain some points that should help you understand.
Nodes = 4, provide the two node failure, It is beneficial to have RF as '3'
- The number of nodes is not the deciding factor for read/write failures. RF (Replication Factor) and CL (Consistency Level) determine whether a read or write fails when replicas are down.
RF -> How many copies of each row are kept, i.e. how many nodes store the same row.
CL -> How many replicas must acknowledge an operation before the client is told that the write or read succeeded. For a write, at least that many replicas (e.g. at least 2 if CL is 2) must confirm that they have written the data. For a read, the coordinator node waits until that many replicas return a result, merges the results (keeping the latest data if replicas hold different versions of the same row), and returns the result to the client.
Note: If CL = RF, you are effectively using CL ALL.
ALL is the highest consistency level: data is guaranteed to be up to date, but the request fails if even a single replica is down.
Scenario 1:
Nodes = 3, Replication Factor = 2, Write Consistency = 2, Read Consistency = 1
For Write operation:
Because the write CL equals RF, you are using the highest write consistency level (equivalent to ALL), so each replica is a single point of failure for writes: both replicas must be alive and acknowledge the write before the client is told it succeeded.
For Read operation:
Read CL is ONE, so reads survive one replica being down, because only one replica needs to return the result. In general, a read at ONE may return stale data if an update has not yet propagated to the replica that answers (it will eventually become consistent), but the read will succeed.
Scenario 2:
Nodes = 3, Replication Factor = 3, Write Consistency = 2, Read Consistency = 1
For Write operation:
As the number of nodes equals RF, every node stores a copy of all the data (each node owns 100%). Writes survive one replica being down, because only 2 of the 3 replicas have to acknowledge.
For Read operation:
Reads survive two replicas being down.
Scenario 3:
Nodes = 4, Replication Factor = 2, Write Consistency = 2, Read Consistency = 1
For Write operation:
Same as scenario 1.
For Read operation:
Same as scenario 1.
Scenario 4:
Nodes = 4, Replication Factor = 3, Write Consistency = 3, Read Consistency = 1
For Write operation:
Same as scenario 1: the write CL equals RF, so all three replicas must be up for a write to succeed.
For Read operation:
Same as scenario 2.
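To make the four scenarios easier to compare, here is a minimal Python sketch. It assumes the simple rule used above: a request at a given CL still succeeds as long as at most RF - CL replicas of the row are down. The names `Scenario`, `tolerated_down` and `summarize` are made up for this illustration and are not part of Cassandra or any driver.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scenario:
    name: str
    nodes: int     # cluster size; deliberately unused in the calculation
    rf: int        # replication factor: copies of each row
    write_cl: int  # replicas that must acknowledge a write
    read_cl: int   # replicas that must answer a read


def tolerated_down(rf: int, cl: int) -> int:
    """Replicas of a row that may be down while a request at this CL still succeeds."""
    if not 1 <= cl <= rf:
        raise ValueError("CL must be between 1 and RF")
    return rf - cl


def summarize(s: Scenario) -> tuple:
    """Return (write tolerance, read tolerance) for a scenario."""
    return (tolerated_down(s.rf, s.write_cl), tolerated_down(s.rf, s.read_cl))


SCENARIOS = [
    Scenario("Scenario 1", nodes=3, rf=2, write_cl=2, read_cl=1),
    Scenario("Scenario 2", nodes=3, rf=3, write_cl=2, read_cl=1),
    Scenario("Scenario 3", nodes=4, rf=2, write_cl=2, read_cl=1),
    Scenario("Scenario 4", nodes=4, rf=3, write_cl=3, read_cl=1),
]

for s in SCENARIOS:
    writes, reads = summarize(s)
    print(f"{s.name}: nodes={s.nodes} RF={s.rf} "
          f"writes tolerate {writes} replica(s) down, reads tolerate {reads}")
```

Note that `nodes` never enters the calculation, which is why scenarios 1 and 3 come out identical.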
Related Link:
Understand cassandra replication factor versus consistency level
For details follow DataStax Doc.
Edited
The number of nodes does not matter when it comes to node failure scenarios (whether read or write requests fail).
Assume you have 3, 4, or 5 nodes: if RF is 3 and CL is QUORUM (floor(3/2) + 1 = 2), the cluster can tolerate 1 replica being down. Please read the "About the QUORUM level" section from the link above.
With more nodes, the cluster can handle more data and load and distribute the data more evenly among the nodes, but the request failure scenarios stay the same.
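The QUORUM arithmetic can be checked with a few lines of Python. This is only a sketch for a single datacenter, where quorum is the replication factor divided by two, rounded down, plus one; the function name `quorum` is just for illustration.

```python
def quorum(rf: int) -> int:
    """Replicas needed for QUORUM with a single-datacenter replication factor `rf`."""
    if rf < 1:
        raise ValueError("RF must be at least 1")
    return rf // 2 + 1  # integer division rounds down


for rf in (2, 3, 5):
    q = quorum(rf)
    print(f"RF={rf}: QUORUM needs {q} replica(s), tolerates {rf - q} down")
```

With RF = 2, QUORUM needs both replicas, so it tolerates no failures, just like ALL.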
Nodes = 3, Replication Factor = 3, Write Consistency = 2, Read Consistency = 1
Nodes = 4, Replication Factor = 3, Write Consistency = 2, Read Consistency = 1
Nodes = 5, Replication Factor = 3, Write Consistency = 2, Read Consistency = 1
As RF is 3 and the write and read CLs are 2 and 1 respectively, in all three cases the cluster can tolerate one replica down for writes and two replicas down for reads. I hope this helps.
Source: https://stackoverflow.com/questions/47468639/4-node-setup-in-cassandra-is-as-same-as-3-node-setup
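If you want to convince yourself that the cluster size does not change this, here is a rough brute-force sketch. It assumes a simplified ring with one token range per node, where each range is replicated on its owner and the next RF - 1 nodes clockwise (similar in spirit to SimpleStrategy, but without vnodes). The helpers `replica_sets` and `survives` are hypothetical names for this example only.

```python
from itertools import combinations


def replica_sets(nodes: int, rf: int) -> list:
    """One replica set per token range: the owner plus the next rf - 1 nodes on the ring."""
    return [{(owner + i) % nodes for i in range(rf)} for owner in range(nodes)]


def survives(nodes: int, rf: int, cl: int, down: int) -> bool:
    """True if, for EVERY choice of `down` failed nodes, every range keeps >= cl live replicas."""
    sets = replica_sets(nodes, rf)
    for failed in combinations(range(nodes), down):
        failed_set = set(failed)
        if any(len(replicas - failed_set) < cl for replicas in sets):
            return False
    return True


for nodes in (3, 4, 5):
    print(
        f"nodes={nodes}: "
        f"write CL=2 with 1 down: {survives(nodes, rf=3, cl=2, down=1)}, "
        f"write CL=2 with 2 down: {survives(nodes, rf=3, cl=2, down=2)}, "
        f"read CL=1 with 2 down: {survives(nodes, rf=3, cl=1, down=2)}"
    )
```

This checks the worst case: with more nodes, a particular pair of failed nodes may not hit the same replica set, but the number of failures you can always survive is still RF - CL.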