Question
I am looking for a way to change a source function in Flink while the job is running:
StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
SourceFunction<String> mySource = ...; // this is the function that I want to change at runtime
DataStream<String> stream = env.addSource(mySource);
stream.map(...).print(); // creating my stream
env.execute("sample");
I am thinking about creating a wrapper around the real SourceFunction implementation that would replace the implementation behind the scenes when needed, but I came across the notion of SourceContext.
Answer 1:
There was a talk at Flink Forward that looked at some related issues. I think you'd find it helpful. See "Bootstrapping State In Apache Flink".
Answer 2:
You could connect the streams from the two source functions and run them into a CoMapFunction (or a CoFlatMapFunction, since a CoMapFunction has to emit one result for every input). Inside it you can decide which records to discard, but that assumes the later source isn't outputting data until you're ready to switch to it.
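As a rough sketch of that idea, the switching logic could look like the following. It follows the shape of Flink's CoFlatMapFunction (flatMap1/flatMap2), because that variant may emit nothing for a given input. To stay runnable without Flink on the classpath, java.util.function.Consumer stands in for Flink's Collector. The class name SwitchOnSecondInput and the demo driver are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical sketch. In a Flink job this class would implement
// CoFlatMapFunction<String, String, String>, and `out` would be a
// Collector<String>; Consumer<String> is only a stand-in here.
class SwitchOnSecondInput {
    // Becomes true once the replacement source emits its first record.
    private boolean switched = false;

    // Records from the original source: forwarded only before the switch.
    public void flatMap1(String value, Consumer<String> out) {
        if (!switched) {
            out.accept(value);
        }
    }

    // Records from the replacement source: the first one triggers the switch.
    public void flatMap2(String value, Consumer<String> out) {
        switched = true;
        out.accept(value);
    }

    // Feeds a few records by hand, in place of a running job.
    static List<String> demo() {
        List<String> out = new ArrayList<>();
        SwitchOnSecondInput fn = new SwitchOnSecondInput();
        fn.flatMap1("a1", out::add);
        fn.flatMap1("a2", out::add);
        fn.flatMap2("b1", out::add); // switch happens here
        fn.flatMap1("a3", out::add); // dropped: original source is now ignored
        fn.flatMap2("b2", out::add);
        return out;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [a1, a2, b1, b2]
    }
}
```

In a real job this would be wired up roughly as first.connect(second).flatMap(...). Note that in this form the flag lives separately in each parallel instance and is not checkpointed, so treat it as a starting point only.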
Answer 3:
OK, as an alternative you can look at an answer I previously provided on Stack Overflow, with some sample code to wrap multiple sources. But note Fabian's comment that this will only preserve order if the downstream operator's parallelism is also 1.
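The following is not the code from that answer, only a minimal sketch of what wrapping multiple sources can look like. The point it tries to show is that the runtime passes the SourceContext into run(), so a wrapper can hand the same context to whichever delegate is currently active. The sketch avoids a Flink dependency: SimpleSource is a stand-in for SourceFunction, a Consumer plays the role of SourceContext, and ChainedSource and ListSource are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

class SequentialSources {
    // Stand-in for Flink's SourceFunction<T>; `ctx` plays the role of
    // SourceContext<T>. (Flink's own run() also declares `throws Exception`.)
    interface SimpleSource<T> {
        void run(Consumer<T> ctx);
        void cancel();
    }

    // Hypothetical wrapper: runs each delegate to completion, one after the
    // other, passing every delegate the same context it received itself.
    static class ChainedSource<T> implements SimpleSource<T> {
        private final List<SimpleSource<T>> delegates;
        private volatile boolean running = true;
        private volatile SimpleSource<T> current;

        ChainedSource(List<SimpleSource<T>> delegates) {
            this.delegates = delegates;
        }

        @Override
        public void run(Consumer<T> ctx) {
            for (SimpleSource<T> delegate : delegates) {
                if (!running) {
                    break;
                }
                current = delegate;
                delegate.run(ctx); // returns when this delegate is exhausted
            }
        }

        @Override
        public void cancel() {
            running = false;
            SimpleSource<T> active = current;
            if (active != null) {
                active.cancel();
            }
        }
    }

    // Finite source that emits a fixed list of strings.
    static class ListSource implements SimpleSource<String> {
        private final List<String> items;
        private volatile boolean running = true;

        ListSource(List<String> items) {
            this.items = items;
        }

        @Override
        public void run(Consumer<String> ctx) {
            for (String item : items) {
                if (!running) {
                    return;
                }
                ctx.accept(item);
            }
        }

        @Override
        public void cancel() {
            running = false;
        }
    }

    // Runs two list sources back to back and collects what they emit.
    static List<String> demo() {
        List<SimpleSource<String>> delegates = Arrays.asList(
                new ListSource(Arrays.asList("a1", "a2")),
                new ListSource(Arrays.asList("b1", "b2")));
        List<String> out = new ArrayList<>();
        new ChainedSource<>(delegates).run(out::add);
        return out;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [a1, a2, b1, b2]
    }
}
```

This only switches when the current delegate finishes on its own. Switching on an external signal would need extra coordination that the sketch does not attempt.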
Source: https://stackoverflow.com/questions/50027128/change-source-function-in-flink-without-interrupting-the-execution