Apache Flink add new stream dynamically

Submitted by 安稳与你 on 2019-12-11 16:56:37

Question


Is it possible in Apache Flink, to add a new datastream dynamically during runtime without restarting the Job?

As far as I understood, a usual Flink program looks like this:

import org.apache.flink.streaming.api.scala._

val env = StreamExecutionEnvironment.getExecutionEnvironment
val text = env.socketTextStream(hostname, port, '\n')
val windowCounts = text.map...

env.execute("Socket Window WordCount")

In my case it is possible that, for example, a new device is started and therefore another stream must be processed. But how can this new stream be added on the fly?


Answer 1:


It is not possible to add new streams at runtime to a Flink program.

The way to solve this problem is to have a single stream which contains all incoming events (e.g. a Kafka topic into which you ingest all individual streams). Each event should carry a key identifying the stream it comes from. This key can then be used to keyBy the stream and apply per-key processing logic.
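As a minimal sketch of that pattern (the host, port, event format, and field names are assumptions for illustration; in practice the source would typically be a Kafka consumer):

```scala
import org.apache.flink.streaming.api.scala._

case class DeviceEvent(deviceId: String, value: Double)

object PerDeviceJob {
  def main(args: Array[String]): Unit = {
    val env = StreamExecutionEnvironment.getExecutionEnvironment

    // All devices publish into one shared source; a socket stands in here.
    // Assumed input format: "<deviceId>,<value>" per line.
    val events: DataStream[DeviceEvent] = env
      .socketTextStream("localhost", 9999)
      .map { line =>
        val Array(id, v) = line.split(",")
        DeviceEvent(id, v.toDouble)
      }

    // keyBy partitions the single stream so each device is processed
    // independently; a device seen for the first time simply creates a
    // new key partition -- no job restart is needed.
    events
      .keyBy(_.deviceId)
      .sum("value")
      .print()

    env.execute("Per-device processing")
  }
}
```

The important point is that "adding a stream" becomes "seeing a new key", which Flink handles transparently.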

If you want to read from multiple sockets, then you could write your own SourceFunction which reads from some input (e.g. a fixed control socket) the ports for which to open new sockets. Internally you would keep all these sockets open and read from them in round-robin fashion.
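A rough sketch of such a custom source, assuming the control socket sends one port number per line (the class name, protocol, and polling interval are hypothetical; the classic SourceFunction interface shown here has since been superseded by the unified Source API in newer Flink versions):

```scala
import java.io.{BufferedReader, InputStreamReader}
import java.net.Socket
import scala.collection.mutable
import org.apache.flink.streaming.api.functions.source.SourceFunction

class MultiSocketSource(controlHost: String, controlPort: Int)
    extends SourceFunction[String] {

  @volatile private var running = true

  override def run(ctx: SourceFunction.SourceContext[String]): Unit = {
    val control   = new Socket(controlHost, controlPort)
    val controlIn = new BufferedReader(new InputStreamReader(control.getInputStream))
    val readers   = mutable.ArrayBuffer.empty[BufferedReader]

    while (running) {
      // Open a new data socket whenever the control socket announces a port.
      if (controlIn.ready()) {
        val port   = controlIn.readLine().trim.toInt
        val socket = new Socket(controlHost, port)
        readers += new BufferedReader(new InputStreamReader(socket.getInputStream))
      }
      // Round-robin over all currently open sockets, reading what is available.
      for (reader <- readers if reader.ready()) {
        ctx.collect(reader.readLine())
      }
      Thread.sleep(10) // avoid busy-spinning when no input is available
    }
  }

  override def cancel(): Unit = running = false
}
```

It would then be attached with something like `env.addSource(new MultiSocketSource("localhost", 9000))`, yielding one logical stream that grows as new ports are announced.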



Source: https://stackoverflow.com/questions/46151065/apache-flink-add-new-stream-dynamically
