Question
We are currently operating on Apache Kafka 0.10.1.1. We are migrating to Confluent Platform 5.X. The new cluster is set up on a completely different set of physical nodes.
While we are already working on upgrading the APIs (our application uses Spring Boot), we are trying to figure out how to migrate the messages. We need to maintain the same ordering of messages in the target cluster.
- Can I simply copy the messages?
- Do I need to republish the messages to the target cluster for successful retention?
- What else can be done?
Answer 1:
Confluent includes a tool called Replicator which, while an enterprise feature, can be used on a 30-day trial to perform data migrations.
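For illustration, a minimal Replicator configuration might look like the sketch below, assuming the Kafka Connect-based deployment; the broker addresses and topic names are placeholders, and the property names should be verified against the Confluent documentation for your version:

```properties
name=migrate-0.10-to-cp5
connector.class=io.confluent.connect.replicator.ReplicatorSourceConnector
# Source = the old 0.10.1.1 cluster, destination = the new CP 5.x cluster
src.kafka.bootstrap.servers=old-broker-1:9092,old-broker-2:9092
dest.kafka.bootstrap.servers=new-broker-1:9092
# Topics to copy; Replicator preserves partition assignments, so per-partition order is kept
topic.whitelist=orders,payments
# Pass keys and values through as raw bytes, leaving payloads untouched
key.converter=io.confluent.connect.replicator.util.ByteArrayConverter
value.converter=io.confluent.connect.replicator.util.ByteArrayConverter
```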
But essentially, yes, the only thing you can do is consume from one cluster and produce into the other. You might get duplicated data at the destination under less-than-optimal network conditions, but that's just a tradeoff of using the platform.
FWIW, I would suggest first adding the matching Confluent 3.x components to the existing cluster, if possible, or even just doing a rolling upgrade of the brokers alone. My point being, there's nothing to "migrate to Confluent": Kafka isn't what's changing; you'd only be adding other processes around it, like the Schema Registry or Control Center.
Answer 2:
Assuming the topic definition in the new cluster is exactly the same (i.e. number of partitions, retention, etc.) and the producer hashing function on the message key will deliver your messages to the same partitions (a bummer if you have null keys, because those end up in random partitions), you can simply consume from earliest on the topic in your old Kafka cluster and produce to the new topic in the new cluster, using a custom consumer/producer (a sketch follows after the next paragraph) or some tool like Logstash.
If you want to be extra sure to get the same ordering, you should use only one consumer per topic; if your consumer supports single-threaded operation, even better (it might avoid race conditions).
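Below is a minimal, single-threaded sketch of that custom consumer/producer, using the plain Java clients. The broker addresses, topic name, and the crude "caught up" heuristic are all placeholders/assumptions. It writes each record to the same partition number it was read from, which bypasses the partitioner entirely, so per-partition ordering is preserved even with null keys, provided the partition counts match:

```java
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class TopicMigrator {
    public static void main(String[] args) {
        Properties consumerProps = new Properties();
        consumerProps.put("bootstrap.servers", "old-cluster:9092"); // placeholder
        consumerProps.put("group.id", "migration-tool");
        consumerProps.put("auto.offset.reset", "earliest");         // replay from the beginning
        consumerProps.put("enable.auto.commit", "false");           // commit only after produce succeeds
        consumerProps.put("key.deserializer",
            "org.apache.kafka.common.serialization.ByteArrayDeserializer");
        consumerProps.put("value.deserializer",
            "org.apache.kafka.common.serialization.ByteArrayDeserializer");

        Properties producerProps = new Properties();
        producerProps.put("bootstrap.servers", "new-cluster:9092"); // placeholder
        producerProps.put("acks", "all");                           // don't lose records in flight
        producerProps.put("key.serializer",
            "org.apache.kafka.common.serialization.ByteArraySerializer");
        producerProps.put("value.serializer",
            "org.apache.kafka.common.serialization.ByteArraySerializer");

        try (KafkaConsumer<byte[], byte[]> consumer = new KafkaConsumer<>(consumerProps);
             KafkaProducer<byte[], byte[]> producer = new KafkaProducer<>(producerProps)) {
            consumer.subscribe(Collections.singletonList("my-topic")); // placeholder topic
            int idlePolls = 0;
            while (idlePolls < 10) { // stop after ~10s without new records (crude end-of-topic check)
                ConsumerRecords<byte[], byte[]> records = consumer.poll(Duration.ofSeconds(1));
                idlePolls = records.isEmpty() ? idlePolls + 1 : 0;
                for (ConsumerRecord<byte[], byte[]> record : records) {
                    // Target the SAME partition number we read from, so per-partition
                    // ordering is preserved regardless of the partitioner or null keys.
                    producer.send(new ProducerRecord<>(record.topic(), record.partition(),
                        record.key(), record.value()));
                }
                producer.flush();      // ensure this batch landed in the new cluster
                consumer.commitSync(); // then record our progress on the old one
            }
        }
    }
}
```

Per the advice above, run exactly one instance of this (one consumer, one thread) per topic.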
You might also try more common solutions like MirrorMaker, but be advised that MirrorMaker's ordering guarantee amounts to:

> The MirrorMaker process will, however, retain and use the message key for partitioning so order is preserved on a per-key basis.
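For reference, the MirrorMaker shipped with these Kafka versions is started roughly as follows, where consumer.properties points at the old cluster (with a group.id and auto.offset.reset=earliest), producer.properties points at the new one, and the topic name is a placeholder:

```sh
bin/kafka-mirror-maker.sh \
  --consumer.config consumer.properties \
  --producer.config producer.properties \
  --whitelist "my-topic" \
  --num.streams 1   # a single consumer stream, per the ordering advice above
```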
Note: As stated in the first solution, and as cricket_007 said, this will only work if you were using the default partitioner and wish to keep using it in the new cluster.
In the end, if everything goes OK, you can manually copy your consumer offsets from the old Kafka broker and set them on the consumer groups in your new cluster.
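On the new cluster (Kafka 2.x, where kafka-consumer-groups.sh supports --reset-offsets), setting a group's position to an offset noted from the old cluster could look like the following, with hypothetical group, topic, partition, and offset values. Note that the numeric offsets only line up if the new topic was populated from offset 0 with no gaps or duplicates:

```sh
bin/kafka-consumer-groups.sh --bootstrap-server new-broker-1:9092 \
  --group my-app-group \
  --topic my-topic:0 \
  --reset-offsets --to-offset 42 \
  --execute   # omit --execute for a dry run
```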
Disclaimer: This is purely theoretical. I've never tried a migration with these kinds of hard requirements.
Source: https://stackoverflow.com/questions/53667042/kafka-message-migration