Kafka & Flink duplicate messages on restart

Submitted by 馋奶兔 on 2019-12-04 07:20:13

(I've posted the same reply in the JIRA, just cross-posting the same here)

From your description, I'm assuming you're manually shutting down the job, and then resubmitting it, correct?

Flink does not guarantee exactly-once semantics across manual job restarts unless you use savepoints (https://ci.apache.org/projects/flink/flink-docs-master/setup/savepoints.html). The exactly-once guarantee applies when a job fails and then automatically restores itself from a previous checkpoint (with checkpointing enabled, as you did with env.enableCheckpointing(500)).
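As a sketch, the savepoint workflow for a manual restart looks roughly like this; the job ID, savepoint directory, and jar name are placeholders, and the commands assume a running Flink cluster:

```shell
# Trigger a savepoint for the running job (keeps the job running)
bin/flink savepoint <jobId> /path/to/savepoints

# ...or cancel the job and take a savepoint in one step
bin/flink cancel -s /path/to/savepoints <jobId>

# Resubmit the job, restoring its state (including Kafka offsets)
# from the savepoint that the previous command printed
bin/flink run -s /path/to/savepoints/savepoint-xxxx your-job.jar
```

Restoring from the savepoint is what carries the checkpointed Kafka offsets across the restart; without the `-s` flag, the resubmitted job starts fresh.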

What is actually happening is that the Kafka consumer simply starts reading from the offsets committed in ZooKeeper / Kafka when you manually resubmit the job. Those offsets were committed to ZooKeeper / Kafka the first time you executed the job. They are not, however, used for Flink's exactly-once semantics; Flink uses its internally checkpointed Kafka offsets for that. The Kafka consumer commits offsets back to ZooKeeper simply to expose a measure of the job's consumption progress to the outside world (i.e., outside of Flink).
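To see the offsets the consumer commits back for progress reporting (as opposed to the offsets Flink checkpoints internally), you can use Kafka's own tooling; the group name and broker address below are placeholder assumptions for whatever you configured in the consumer properties:

```shell
# Inspect the offsets committed by the Flink job's consumer group;
# these reflect reported progress, not Flink's checkpointed state
bin/kafka-consumer-groups.sh --bootstrap-server localhost:9092 \
  --describe --group my-flink-group
```

If these committed offsets lag behind or differ from what the restored job reads, that is expected: only the checkpoint/savepoint offsets are authoritative for Flink's restore behavior.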

Update 2: I fixed the bug in the offset handling; it has been merged into the current MASTER.

Update: Not an issue; take a manual savepoint before canceling the job (thanks to Gordon).

I checked the logs and it seems like a bug in the offset handling. I filed a report under https://issues.apache.org/jira/browse/FLINK-4618. I will update this answer when I get feedback.
