MongoDB as datasource to Flink

Posted by 冷暖自知 on 2020-06-11 08:43:05

Question


Can MongoDB be used as a data source for Apache Flink to process streaming data?

What native support does Apache Flink provide for using a NoSQL database as a data source?


Answer 1:


Currently, Flink does not have a dedicated connector to read from MongoDB. What you can do is the following:

  • Use StreamExecutionEnvironment.createInput and provide a Hadoop input format for MongoDB using Flink's wrapper input format
  • Implement your own MongoDB source by implementing SourceFunction or ParallelSourceFunction

The former should give you at-least-once processing guarantees since the MongoDB collection is completely re-read in case of a recovery. Depending on the functionality of the MongoDB client, you might be able to implement exactly-once processing guarantees with the latter approach.
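The difference between the two recovery behaviors can be illustrated with a minimal plain-Java sketch. It does not use the Flink or MongoDB APIs: the class RecoverySketch, the in-memory COLLECTION list, and the integer offset are all hypothetical stand-ins, and the sketch assumes a checkpoint is taken right at the point of failure. A real source would have to derive a resumable position from whatever the MongoDB client offers.

```java
import java.util.ArrayList;
import java.util.List;

class RecoverySketch {
    // Hypothetical stand-in for a MongoDB collection; a real source would read through a MongoDB client.
    static final List<String> COLLECTION = List.of("a", "b", "c", "d");

    // Emits records from startOffset onward into out.
    // If failAfter >= 0, a crash is simulated after that many records have been emitted.
    // Returns the offset of the next record that was not yet emitted.
    static int read(int startOffset, int failAfter, List<String> out) {
        int offset = startOffset;
        int emitted = 0;
        while (offset < COLLECTION.size()) {
            if (failAfter >= 0 && emitted == failAfter) {
                return offset; // simulated failure
            }
            out.add(COLLECTION.get(offset));
            offset++;
            emitted++;
        }
        return offset;
    }

    // First approach: on recovery the collection is re-read from the start,
    // so records emitted before the failure are seen again (at-least-once).
    static List<String> rereadOnRecovery(int failAfter) {
        List<String> out = new ArrayList<>();
        read(0, failAfter, out); // first attempt, fails part-way
        read(0, -1, out);        // recovery: start over
        return out;
    }

    // Second approach: the read position is stored in the checkpoint (assumed here
    // to coincide with the failure point), so recovery resumes without duplicates.
    static List<String> resumeFromOffset(int failAfter) {
        List<String> out = new ArrayList<>();
        int checkpointedOffset = read(0, failAfter, out);
        read(checkpointedOffset, -1, out); // recovery: resume from the stored offset
        return out;
    }

    public static void main(String[] args) {
        System.out.println(rereadOnRecovery(2)); // prints [a, b, a, b, c, d]
        System.out.println(resumeFromOffset(2)); // prints [a, b, c, d]
    }
}
```

In a custom Flink source, the offset would need to be part of the checkpointed state of the source, and whether a stable, resumable position exists depends on the functionality of the MongoDB client.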



Source: https://stackoverflow.com/questions/44153519/mongodb-as-datasource-to-flink
