Can I rely on an in-memory Java collection in Kafka Streams to buffer events by fine-tuning the punctuate and commit intervals?

小鲜肉 2021-01-24 17:38

I have a custom processor that buffers events in a plain java.util.List inside process(). This buffer is not a state store.

Every 30 seconds WALL_CLOCK_

1 Answer
  • 2021-01-24 18:33

    I worked out a few arguments against calling this setup foolproof just because the commit and punctuate intervals have been tuned.

    From the docs, on WALL_CLOCK_TIME:

    This is best effort only as its granularity is limited by how long an iteration of the processing loop takes to complete

    It's possible to "miss" a punctuation if: with PunctuationType#WALL_CLOCK_TIME, on GC pause, too short interval
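To make the "best effort" wording concrete, here is a minimal sketch in plain Java. It is a toy model, not Kafka Streams code: the class `WallClockPunctuationModel` and the method `punctuationTimes` are hypothetical names, and the scheduling rule (an overdue punctuation fires once, and slots missed in the meantime are skipped) is an assumption drawn from the quoted docs rather than from the library's internals.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of best-effort wall-clock punctuation; not Kafka Streams code. */
final class WallClockPunctuationModel {

    /**
     * Returns the times (in seconds) at which punctuate() would fire when the
     * single processing thread is blocked by each process() call in turn.
     * Assumption: an overdue punctuation fires once, and any other
     * punctuations missed in the meantime are skipped.
     */
    static List<Long> punctuationTimes(long intervalSec, long[] processDurationsSec) {
        List<Long> fired = new ArrayList<>();
        long now = 0;
        long nextDue = intervalSec;
        for (long duration : processDurationsSec) {
            now += duration; // process() holds the thread for this long
            if (now >= nextDue) {
                fired.add(now); // fires late if process() overran the due time
                nextDue = (now / intervalSec + 1) * intervalSec; // skip missed slots
            }
        }
        return fired;
    }

    public static void main(String[] args) {
        long[] quick = {5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5};
        long[] slow = {15, 30, 15};
        System.out.println(punctuationTimes(20, quick)); // [20, 40, 60]
        System.out.println(punctuationTimes(20, slow));  // [45, 60]
    }
}
```

With short process() calls the 20-second schedule is met exactly. With one 30-second call, the punctuations due at 20s and 40s collapse into a single late one at 45s.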

    Ideal:

    punctuate : |-------20s-------|-------20s-------|------20s-------|------20s------|

    commit    : |------------30s------------|------------30s-----------|------------30s---

    Say process() takes too long (say 18 seconds), so punctuate() is not invoked for its second run at the 40th second because, as the docs mention, the interval is too short.

    Now, if the application crashes at the 31st second, then even with EOS enabled, the offsets of the events still sitting in the buffer will already have been committed at the source. On restart, those events are not re-read, and the buffer is lost.

    punctuate : |-------20s-------|------process()---------20s-------|------20s------|

    commit    : |------------30s------------|------------30s-------------|------------30s---

    Hence, it is not a valid argument that tuning the commit and punctuate intervals removes the need for a state store.
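The timelines above can be replayed with a small plain-Java sketch. This is a toy model under stated assumptions, not Kafka Streams code: `BufferLossModel` and `lostOnCrash` are hypothetical names, one event is assumed to arrive per second, and commits are assumed to stay on their 30-second schedule.

```java
/** Toy timeline of an in-memory buffer vs. offset commits; not Kafka Streams code. */
final class BufferLossModel {

    /**
     * One event arrives per second (event n at second n) and is appended to an
     * in-memory buffer. punctuate() flushes the buffer downstream; commit()
     * records the offset of the last consumed event. Returns how many events
     * are committed at the source but still only in memory at crashSec.
     */
    static long lostOnCrash(long[] punctuateSecs, long[] commitSecs, long crashSec) {
        long flushed = 0;   // events forwarded downstream by punctuate()
        long committed = 0; // events whose offsets are committed
        for (long t = 1; t <= crashSec; t++) {
            long consumed = t; // event t has just been buffered
            if (contains(punctuateSecs, t)) flushed = consumed;
            if (contains(commitSecs, t)) committed = consumed;
        }
        // After a restart, consumption resumes past `committed`,
        // so anything committed but not flushed is gone with the buffer.
        return Math.max(0, committed - flushed);
    }

    private static boolean contains(long[] values, long x) {
        for (long v : values) if (v == x) return true;
        return false;
    }

    public static void main(String[] args) {
        long[] commits = {30, 60};
        // On schedule: punctuate at 20s and 40s, crash at 31s.
        System.out.println(lostOnCrash(new long[]{20, 40}, commits, 31)); // 10
        // The 20s punctuation is missed and the first one fires at 45s.
        System.out.println(lostOnCrash(new long[]{45, 60}, commits, 31)); // 30
    }
}
```

In this model, even the on-schedule case loses the 10 events buffered between the 20s punctuation and the 30s commit, and a missed punctuation widens the gap to 30. A state store avoids this because its contents are backed by a changelog and can be restored on restart.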
