Kafka Streams - Processor context commit

元气小坏坏 submitted on 2020-12-08 06:28:42

Question


Should we ever invoke processorContext.commit() ourselves in a Processor implementation? I mean invoking the commit method inside a scheduled Punctuator implementation or inside the process method.

In which use cases should we do that, and do we need it at all? The question relates to both the Kafka Streams DSL with transform() and the Processor API.

It seems Kafka Streams handles commits by itself. Also, invoking processorContext.commit() does not guarantee that the commit will be done immediately.


Answer 1:


It is OK to call commit(), either from the Processor or from a Punctuator; that is why this API is offered.

While Kafka Streams commits on a regular (configurable) interval, you can use commit() to request intermediate commits. One example use case: you usually do cheap computation, but sometimes you do something expensive and want to commit as soon as possible afterwards instead of waiting for the next commit interval (to reduce the likelihood of a failure between the expensive operation and the next regular commit). Another use case: you set the commit interval to MAX_VALUE, which effectively "disables" regular commits, and decide when to commit based on your business logic.
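The first use case can be sketched roughly as follows. This is a minimal, library-free illustration, not Kafka Streams code: EarlyCommitHandler, isExpensive, and requestCommit are hypothetical names, and the toUpperCase() call merely stands in for the actual computation. In a real Processor, the requestCommit callback would be the place to call processorContext.commit().

```java
import java.util.function.Predicate;

/**
 * Hypothetical helper (not part of Kafka Streams): requests an early commit
 * right after a record that was expensive to process.
 */
final class EarlyCommitHandler {
    // Assumption: the caller can tell which records trigger expensive work.
    private final Predicate<String> isExpensive;
    // Assumption: in a real Processor this would wrap processorContext.commit().
    private final Runnable requestCommit;

    EarlyCommitHandler(Predicate<String> isExpensive, Runnable requestCommit) {
        this.isExpensive = isExpensive;
        this.requestCommit = requestCommit;
    }

    /** Processes one value and requests a commit if the work was expensive. */
    String handle(String value) {
        String result = value.toUpperCase(); // stand-in for the actual computation
        if (isExpensive.test(value)) {
            // Only a request: the commit itself happens later, not inside this call.
            requestCommit.run();
        }
        return result;
    }
}
```

For the second use case, the regular interval is controlled by the commit.interval.ms setting (StreamsConfig.COMMIT_INTERVAL_MS_CONFIG); the same callback could then be driven purely by business logic.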

I guess that calling commit() is not necessary for most use cases, though.




Answer 2:


For my use case, I am batching a certain number of records in the processor's process method and writing the batched records to a file from process once the batch size reaches a certain number (let's say 10).

Let's say we write one batch of records to the file and the system crashes before the commit happens (since we can't commit explicitly). The next time the stream starts, the processor processes records from the last committed offset. This means we could be writing some duplicate data to the files. Is there any way to avoid writing duplicate data?
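The thread does not answer this. One common idea is to make the file write idempotent by remembering the offset of the last record that was flushed and skipping anything at or below it on replay. The sketch below is a library-free illustration under several assumptions: OffsetAwareBatcher, sink, and requestCommit are hypothetical names, records come from a single partition (so offsets only increase), the sink stands in for the file write, and lastFlushedOffset would have to be persisted together with the file in a real system.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/**
 * Hypothetical batcher (not part of Kafka Streams): flushes fixed-size batches
 * and ignores replayed records that were already flushed before a crash.
 */
final class OffsetAwareBatcher {
    private final int batchSize;
    private final Consumer<List<String>> sink; // stand-in for writing one batch to a file
    private final Runnable requestCommit;      // in a Processor: processorContext.commit()
    private final List<String> buffer = new ArrayList<>();
    private long lastFlushedOffset;            // assumption: restored from durable storage on restart

    OffsetAwareBatcher(int batchSize, long lastFlushedOffset,
                       Consumer<List<String>> sink, Runnable requestCommit) {
        this.batchSize = batchSize;
        this.lastFlushedOffset = lastFlushedOffset;
        this.sink = sink;
        this.requestCommit = requestCommit;
    }

    /** Adds one record; offsets are assumed to come from a single partition. */
    void add(long offset, String value) {
        if (offset <= lastFlushedOffset) {
            return; // already written before the crash, skip the duplicate
        }
        buffer.add(value);
        if (buffer.size() >= batchSize) {
            sink.accept(new ArrayList<>(buffer)); // write a copy of the full batch
            buffer.clear();
            lastFlushedOffset = offset;
            requestCommit.run(); // ask for a commit right after the flush
        }
    }

    long lastFlushedOffset() {
        return lastFlushedOffset;
    }
}
```

Requesting a commit right after each flush only narrows the window in which a crash causes a replay; the offset check is what actually filters the duplicates. How lastFlushedOffset is stored atomically with the file is left open here.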



Source: https://stackoverflow.com/questions/54075610/kafka-streams-processor-context-commit
