We have a 6-node cluster deployed in a single AWS region across 3 Availability Zones. We use Ec2Snitch, which should place one replica in each availability zone, and the DataStax Java driver. The application servers doing reads and writes are distributed across the availability zones the same way the nodes are (one server per AZ). What we want is the best possible read performance; writes are less important in the sense that we need to write the data, but not necessarily fast. We use replication factor 3 but read and write with consistency level ONE.
We are investigating the shuffleReplicas option of TokenAwarePolicy. According to the DataStax Java driver documentation, it can increase read performance but decrease write distribution.
The first question is about the shuffleReplicas implementation. I followed the implementation of the newQueryPlan method, and what I found is that a LinkedHashSet is used for the replicas, meaning the primary replica will always be preferred over a non-primary replica:
// Preserve order - primary replica will be first
Set<Host> replicas = new LinkedHashSet<Host>();
Just to confirm: does that mean that, with shuffleReplicas set to false, the driver will always prefer to connect to the node holding the primary replica and use it as coordinator, which can create hot spots?
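For context, here is a minimal sketch of the configuration we are asking about, assuming a driver 3.x version where TokenAwarePolicy still takes a boolean shuffleReplicas constructor argument; the contact points are placeholders:

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.policies.DCAwareRoundRobinPolicy;
import com.datastax.driver.core.policies.TokenAwarePolicy;

// With shuffleReplicas = false the query plan keeps replicas in ring order,
// so the primary replica of each partition is always tried first as coordinator.
Cluster readCluster = Cluster.builder()
        .addContactPoints("10.0.1.11", "10.0.2.11", "10.0.3.11") // placeholder IPs
        .withLoadBalancingPolicy(
                new TokenAwarePolicy(DCAwareRoundRobinPolicy.builder().build(), false))
        .build();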
The second question is about the idea of using separate connections to the cluster: for writes, use shuffleReplicas set to true, which distributes load evenly across the cluster, and for reads, use TokenAwarePolicy with shuffleReplicas set to false to get the best possible reads. Is this idea viable, and do you see any problems with it?
We would like reads to always come from the same availability zone to get the maximum possible speed when reading data. Is this a better approach than leaving shuffleReplicas set to true and letting the cluster choose the coordinator evenly? Another idea is to use WhiteListPolicy, which would select only the nodes in the same AZ as the server placed in that AZ, resulting in local reads, but that can create hot spots.
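To make the WhiteListPolicy idea concrete, here is a rough sketch of what each application server could do; the node addresses are hypothetical and stand for the two nodes sharing that server's AZ:

import java.net.InetSocketAddress;
import java.util.Arrays;
import java.util.List;
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.policies.RoundRobinPolicy;
import com.datastax.driver.core.policies.WhiteListPolicy;

// Hypothetical addresses of the two nodes that share this server's AZ.
List<InetSocketAddress> sameAzNodes = Arrays.asList(
        new InetSocketAddress("10.0.1.11", 9042),
        new InetSocketAddress("10.0.1.12", 9042));

// WhiteListPolicy only ever routes to the listed hosts, so every coordinator
// is in the local AZ, at the cost of concentrating load on those two nodes.
Cluster localAzCluster = Cluster.builder()
        .addContactPoint("10.0.1.11")
        .withLoadBalancingPolicy(new WhiteListPolicy(new RoundRobinPolicy(), sameAzNodes))
        .build();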
Just to confirm: does that mean that, with shuffleReplicas set to false, the driver will always prefer to connect to the node holding the primary replica and use it as coordinator, which can create hot spots?
Yes. Note however that this creates hot spots only if all your partition keys map to the same replica; if your partition keys are evenly distributed across the token ring, it should be fine.
The second question is about the idea of using separate connections to the cluster: for writes, use shuffleReplicas set to true, which distributes load evenly across the cluster, and for reads, use TokenAwarePolicy with shuffleReplicas set to false to get the best possible reads. Is this idea viable, and do you see any problems with it?
The main problem I see is that the driver cannot tell whether a request is a "read" or a "write", so you will have to either write your own load balancing policy or use two separate Cluster instances, one for reads and one for writes.
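If you go the two-Cluster route, a minimal sketch could look like the following, assuming a driver version with the boolean shuffleReplicas constructor; contact points and keyspace name are placeholders:

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Session;
import com.datastax.driver.core.policies.DCAwareRoundRobinPolicy;
import com.datastax.driver.core.policies.TokenAwarePolicy;

// Writes: shuffle replicas so coordinator duty is spread over all replicas.
Cluster writeCluster = Cluster.builder()
        .addContactPoints("10.0.1.11", "10.0.2.11", "10.0.3.11")
        .withLoadBalancingPolicy(
                new TokenAwarePolicy(DCAwareRoundRobinPolicy.builder().build(), true))
        .build();

// Reads: no shuffling, so the primary replica is always tried first.
Cluster readCluster = Cluster.builder()
        .addContactPoints("10.0.1.11", "10.0.2.11", "10.0.3.11")
        .withLoadBalancingPolicy(
                new TokenAwarePolicy(DCAwareRoundRobinPolicy.builder().build(), false))
        .build();

Session writeSession = writeCluster.connect("my_keyspace");
Session readSession = readCluster.connect("my_keyspace");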
Otherwise, setting shuffleReplicas to false doesn't necessarily mean you will get the "best possible reads". The main effect to consider with shuffleReplicas is eventual consistency: when shuffleReplicas is true, it is possible to read stale values, e.g. if you write to replica 1 with consistency ONE and then read from replica 2 with consistency ONE. I usually recommend setting shuffleReplicas to true for both reads and writes to spread the load evenly across your cluster, and adjusting your consistency levels to get the best balance between throughput and the risk of reading stale values.
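For example, assuming an already-connected Session named session, you could keep shuffleReplicas on and raise only the read consistency level; keyspace, table, and key are placeholders:

import com.datastax.driver.core.ConsistencyLevel;
import com.datastax.driver.core.Statement;
import com.datastax.driver.core.SimpleStatement;

String id = "some-key"; // placeholder partition key

// With RF = 3, reading at LOCAL_QUORUM queries two replicas instead of one,
// which lowers the chance of a stale read; full read-your-writes would also
// require writing at (LOCAL_)QUORUM so that W + R > RF.
Statement read = new SimpleStatement(
        "SELECT * FROM my_keyspace.my_table WHERE id = ?", id)
        .setConsistencyLevel(ConsistencyLevel.LOCAL_QUORUM);
session.execute(read);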
Source: https://stackoverflow.com/questions/35843798/cassandra-load-balancing-with-tokenawarepolicy-and-shufflereplicas