I'm trying to improve the performance of writing data to a Redis cluster. We are planning to move from Redis Sentinel to cluster mode for scalability. But the write performance suffers, because cluster mode doesn't support pipelining.
WRONG!
With a single pipeline, you can only send multiple commands over the same connection to the same node. This has nothing to do with whether that node is a standalone instance or a member of a Redis Cluster.
So your problem should be: with a single pipeline, we CANNOT send multiple commands whose keys are distributed across multiple slots. To solve that, you want those keys to be located in the same slot. How can we achieve that?
how to know/compute (before writing to cluster) to which node/slot a particular key would be written to
You don't need to do the math yourself. You can use Hash Tags to force multiple keys into the same hash slot. So you only need to rename the keys you want in the same slot so that they share the same hash tag, e.g. rename user-name and user-age to {user-id}user-name and {user-id}user-age. See the Hash Tags section of the cluster spec for details.
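To make the hash-tag rule concrete, here is a minimal, self-contained sketch of how Redis Cluster maps a key to a slot: CRC16 (XModem variant) of the key, or of the substring between the first { and the following } if that tag is non-empty, modulo 16384. The crc16 and slotOf names are my own; in Jedis the equivalent is JedisClusterCRC16.getSlot.

```java
import java.nio.charset.StandardCharsets;

public class SlotDemo {
    // CRC16/XModem: poly 0x1021, init 0 -- the variant Redis Cluster uses
    static int crc16(byte[] bytes) {
        int crc = 0;
        for (byte b : bytes) {
            crc ^= (b & 0xFF) << 8;
            for (int i = 0; i < 8; i++) {
                crc = ((crc & 0x8000) != 0) ? ((crc << 1) ^ 0x1021) : (crc << 1);
                crc &= 0xFFFF;
            }
        }
        return crc;
    }

    // Slot of a key: hash only the {...} hash tag when present and non-empty
    static int slotOf(String key) {
        int open = key.indexOf('{');
        if (open >= 0) {
            int close = key.indexOf('}', open + 1);
            if (close > open + 1) {                 // empty tags {} are ignored
                key = key.substring(open + 1, close);
            }
        }
        return crc16(key.getBytes(StandardCharsets.UTF_8)) % 16384;
    }

    public static void main(String[] args) {
        // Both keys hash only "user-id", so they land in the same slot
        System.out.println(slotOf("{user-id}user-name") == slotOf("{user-id}user-age")); // prints true
    }
}
```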
Solution 1:
Found a solution to identify the slot a given key goes to. JedisCluster has APIs for it:
int slotNum = JedisClusterCRC16.getSlot(key); // slot number of the key
Set<HostAndPort> redisClusterNode = new HashSet<>();
redisClusterNode.add(new HostAndPort(hostItem, port));
JedisSlotBasedConnectionHandler connHandler =
        new JedisSlotBasedConnectionHandler(redisClusterNode, poolConfig, 60);
Jedis jedis = connHandler.getConnectionFromSlot(slotNum);
This provides the Jedis object (taken from a JedisPool internally) for the specific node in the cluster. Now, with that Jedis object, all commands can easily be pipelined to that one node:
Pipeline pipeline = jedis.pipelined();
pipeline.multi();
for (Entry<String, Map<String, String>> kvf : kvfs.entrySet()) {
    pipeline.hmset(kvf.getKey(), kvf.getValue());
}
pipeline.exec();
pipeline.sync(); // flush the buffered commands and read all responses
Although this approach (with JedisCluster) found the appropriate node for each key, it didn't give me the expected performance. I think that is due to the work involved in resolving the slot number and the node owning that slot: the procedure above seems to establish a physical connection to the cluster node every time we resolve the Jedis instance holding the slot, which hurts performance when we have millions of keys.
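One way to avoid a per-key lookup is to bucket the keys by slot first, then open one connection and one pipeline per bucket. The sketch below shows only the bucketing step; groupBySlot is my own name, and slotOf is a stand-in (hashCode of the {...} tag, or of the whole key) for the real JedisClusterCRC16.getSlot, which uses CRC16 mod 16384.

```java
import java.util.*;

public class SlotBuckets {
    // Stand-in for JedisClusterCRC16.getSlot: hashes the {...} tag if present,
    // otherwise the whole key. The real slot function is CRC16 mod 16384.
    static int slotOf(String key) {
        int open = key.indexOf('{');
        if (open >= 0) {
            int close = key.indexOf('}', open + 1);
            if (close > open + 1) key = key.substring(open + 1, close);
        }
        return Math.floorMod(key.hashCode(), 16384);
    }

    // Bucket keys by slot so each bucket can be written with a single pipeline
    static Map<Integer, List<String>> groupBySlot(Collection<String> keys) {
        Map<Integer, List<String>> buckets = new HashMap<>();
        for (String key : keys) {
            buckets.computeIfAbsent(slotOf(key), s -> new ArrayList<>()).add(key);
        }
        return buckets;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("{42}name", "{42}age", "{7}name");
        // "{42}name" and "{42}age" share a bucket; "{7}name" is in another
        System.out.println(groupBySlot(keys).size()); // prints 2
    }
}
```

Each bucket would then be sent through one pipeline against the node owning that slot, turning millions of per-key lookups into one lookup per slot.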
So, another approach (below) using Lettuce package helped me to over come this.
Solution 2:
Used the Lettuce library, which supports sending batches of commands in cluster mode. Maven dependency:
<dependency>
    <groupId>biz.paluch.redis</groupId>
    <artifactId>lettuce</artifactId>
    <version>4.4.3.Final</version>
</dependency>
Code snippet:
RedisClusterClient client = RedisClusterClient.create(RedisURI.create(hostname, port)); // port is an int
StatefulRedisClusterConnection<String, String> connection = client.connect();
RedisAdvancedClusterAsyncCommands<String, String> commands = connection.async();
// Disabling auto-flushing
commands.setAutoFlushCommands(false);
List<RedisFuture<?>> futures = new ArrayList<>();
// kvf is of type Map<String, Map<String, String>>
for (Entry<String, Map<String, String>> e : kvf.entrySet()) {
    futures.add(commands.hmset(e.getKey(), e.getValue()));
}
// write all commands to the transport layer
commands.flushCommands();
// synchronization example: Wait until all futures complete
LettuceFutures.awaitAll(10, TimeUnit.SECONDS,
        futures.toArray(new RedisFuture[futures.size()]));
Ref: https://github.com/lettuce-io/lettuce-core/wiki/Pipelining-and-command-flushing