Flink: How to handle external app configuration changes in flink

梦想与她 提交于 2019-11-29 20:57:35

问题


My requirement is to stream millions of records in a day and it has huge dependency on external configuration parameters. For example, a user can go and change the required setting anytime in the web application and after the change is made, the streaming has to happen with the new application config parameters. These are app level configurations and we also have some dynamic exclude parameters which each data has to be passed through and filtered.

I see that flink doesn’t have global state which is shared across all task managers and subtasks. Having a centralized cache is an option but for each parameter I would have to read it from cache which will increase the latency. Please advise on the better approach to handle these kind of scenarios and how other applications are handling it. Thanks.


回答1:


Updating the configuration of a running streaming application is a common requirements. In Flink's DataStream API this can be done using a so-called CoFlatMapFunction which processes two input streams. One of the streams can be a data stream and the other a control stream.

The following example shows how to dynamically adapt a user function that filters out strings that exceed a certain length.

val data: DataStream[String] = ???
val control: DataStream[Int] = ???

val filtered: DataStream[String] = data
  // broadcast all control messages to the following CoFlatMap subtasks
  .connect(control.broadcast)
  // process data and control messages
  .flatMap(new DynLengthFilter)


class DynLengthFilter extends CoFlatMapFunction[String, Int, String] with Checkpointed[Int] {

  var length = 0

  // filter strings by length
  override def flatMap1(value: String, out: Collector[String]): Unit = {
    if (value.length < length) {
      out.collect(value)
    }
  }

  // receive new filter length
  override def flatMap2(value: Int, out: Collector[String]): Unit = {
    length = value
  }

  override def snapshotState(checkpointId: Long, checkpointTimestamp: Long): Int = length

  override def restoreState(state: Int): Unit = {
    length = state
  }
}

The DynLengthFilter user function implements the Checkpointed interface for the filter length. In case of a failure, this information is automatically restored.



来源:https://stackoverflow.com/questions/39693919/flink-how-to-handle-external-app-configuration-changes-in-flink

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!