Kafka as a data store for future events

只谈情不闲聊 submitted on 2019-12-24 16:38:56

Question


I have a Kafka cluster that receives messages from a source whenever data changes in that source. Some of these messages are meant to be processed in the future, so I have two options:

  1. Consume all messages and re-publish those meant for the future to Kafka under a different topic (with the date in the topic name), then have a Storm topology look for topics with that date in the name. This ensures that messages are processed only on the day they are meant for (a minimal producer sketch of this routing follows the question).
  2. Store them in a separate DB and build a scheduler that reads the messages and posts them to Kafka only on their future date.

Option 1 is easier to execute, but my questions are: Is Kafka a durable data store? Has anyone done this sort of eventing with Kafka? Are there any gaping holes in the design?
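
For context, here is a minimal sketch of option 1's routing logic using the Java Kafka client. The `events-YYYY-MM-DD` topic naming scheme, the due date, and the payload are all illustrative assumptions, not part of the original question:

```java
import java.time.LocalDate;
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class FutureEventRouter {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", StringSerializer.class.getName());

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // Suppose each message carries its intended processing date.
            LocalDate dueDate = LocalDate.of(2015, 1, 30); // illustrative
            String payload = "{\"orderId\": 42}";          // illustrative

            // Route the message to a per-day topic, e.g. "events-2015-01-30".
            // A Storm topology started on that day would subscribe to it.
            String topic = "events-" + dueDate; // LocalDate.toString() is ISO-8601
            producer.send(new ProducerRecord<>(topic, payload));
        }
    }
}
```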


Answer 1:


You can configure how long messages are retained in Kafka via the broker setting log.retention.hours.
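
For example, a sketch of the relevant broker configuration; the 720-hour value is illustrative, and you would pick a window longer than the furthest future date you expect:

```properties
# server.properties (broker config) - illustrative values.
# Keep messages for 30 days instead of the 7-day default,
# so future-dated events survive until their processing day:
log.retention.hours=720
# Individual topics can override this with the topic-level
# retention.ms setting.
```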

But keep in mind that Kafka is meant to be used as a "real-time buffer" between your producers and your consumers, not as a durable data store. I don't think Kafka + Storm would be the appropriate tool for your use case. Why not just write your messages to some distributed file system and schedule a job (MapReduce, Spark, ...) to process those events?
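
A minimal batch sketch of that second approach in Spark's Java API, assuming events were archived as JSON with a dueDate field; the path, field name, and topic are illustrative, and writing to Kafka requires the spark-sql-kafka connector on the classpath:

```java
import java.time.LocalDate;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class DueEventReplayer {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("due-event-replayer")
                .getOrCreate();

        // Read the archived events (path and schema are assumptions).
        Dataset<Row> events = spark.read().json("hdfs:///archive/events/");

        // Keep only the events whose processing date is today.
        Dataset<Row> dueToday = events.filter(
                events.col("dueDate").equalTo(LocalDate.now().toString()));

        // Re-publish the due events to Kafka for normal processing.
        dueToday.selectExpr("to_json(struct(*)) AS value")
                .write()
                .format("kafka")
                .option("kafka.bootstrap.servers", "localhost:9092")
                .option("topic", "events-due")
                .save();

        spark.stop();
    }
}
```

Scheduling this job to run once a day (via cron or a workflow scheduler) gives you the deferred delivery without relying on Kafka retention.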



Source: https://stackoverflow.com/questions/28105864/kafka-as-a-data-store-for-future-events
