How to count the number of records processed by Apache Flink in a given time window

僤鯓⒐⒋嵵緔 submitted on 2021-01-01 04:29:58

Question


After defining a time window in Flink as follows:

val lines = socket.timeWindowAll(Time.seconds(5))

How can I compute the number of records in that particular window of 5 seconds?


Answer 1:


The most efficient way to perform a count aggregation is a ReduceFunction. However, reduce has the restriction that input and output type must be identical. So you would have to convert the input to an Int before applying the window:

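import org.apache.flink.streaming.api.scala._
import org.apache.flink.streaming.api.windowing.time.Time

val socket: DataStream[String] = ???

val cnts: DataStream[Int] = socket
  .map(_ => 1)                    // convert each record to 1
  .timeWindowAll(Time.seconds(5)) // group into 5 second windows
  .reduce( (x, y) => x + y)       // sum the 1s to get the count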
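
To make it easier to see what the map-then-reduce pipeline above computes, here is a minimal sketch in plain Scala that does not use Flink at all. It assumes each record carries a millisecond timestamp and that windows are aligned to multiples of the window size. `Record`, `windowStart` and `countPerWindow` are hypothetical names made up for this illustration, not part of the Flink API.

```scala
// Illustrative only: plain Scala collections, no Flink dependency.
// Record, windowStart and countPerWindow are hypothetical names for this sketch.
final case class Record(timestampMs: Long, line: String)

object TumblingCountSketch {
  // Start of the tumbling window containing the timestamp
  // (assumes non-negative timestamps and windows aligned to 0).
  def windowStart(timestampMs: Long, sizeMs: Long): Long =
    timestampMs - (timestampMs % sizeMs)

  // Mirrors map(_ => 1) followed by reduce(_ + _) inside each window.
  def countPerWindow(records: Seq[Record], sizeMs: Long): Map[Long, Int] =
    records
      .groupBy(r => windowStart(r.timestampMs, sizeMs))
      .map { case (start, rs) => start -> rs.map(_ => 1).reduce(_ + _) }

  def main(args: Array[String]): Unit = {
    val records = Seq(
      Record(1000L, "a"), Record(4999L, "b"), // both fall into the window starting at 0
      Record(5000L, "c")                      // falls into the window starting at 5000
    )
    countPerWindow(records, 5000L).toSeq.sortBy(_._1).foreach {
      case (start, n) => println(s"window@$start -> $n")
    }
  }
}
```

In a real job Flink assigns the windows and applies the reduce incrementally as records arrive, so only one running Int per window has to be kept rather than the whole list of records.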



Answer 2:


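You could try the following, which may solve your problem. It counts records per key (the first comma-separated field of each line) over a 10-second window that slides every 5 seconds.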

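// senv is the StreamExecutionEnvironment, as predefined in the Flink Scala shell
val text = senv.socketTextStream("localhost", 9999)
val counts = text.map {(m: String) => (m.split(",")(0), 1) } // (key, 1) per record
    .keyBy(0)                                      // group by the key field
    .timeWindow(Time.seconds(10), Time.seconds(5)) // 10 second window, sliding every 5 seconds
    .sum(1)                                        // sum the 1s to get the count per key
counts.print()
senv.execute("ProcessingTime processing example")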
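
Because the window above slides, each record is counted in more than one window. The following Flink-free sketch in plain Scala (2.13 or later, for `groupMapReduce`) illustrates that. It assumes records arrive as hypothetical `(timestampMs, line)` pairs with non-negative timestamps and that windows are aligned to multiples of the slide. `windowStarts` and `countPerKeyAndWindow` are names invented for this illustration, not Flink API.

```scala
// Illustrative only: plain Scala collections, no Flink dependency.
// windowStarts and countPerKeyAndWindow are hypothetical names for this sketch.
object SlidingCountSketch {
  // Start times of every sliding window that contains the timestamp
  // (windows aligned to 0, non-negative timestamps assumed).
  def windowStarts(timestampMs: Long, sizeMs: Long, slideMs: Long): Seq[Long] = {
    val lastStart = timestampMs - (timestampMs % slideMs)
    Iterator.iterate(lastStart)(_ - slideMs)
      .takeWhile(start => start > timestampMs - sizeMs)
      .toSeq
  }

  // Mirrors: map to (key, 1), key by the first field, sliding window, sum of the 1s.
  def countPerKeyAndWindow(
      records: Seq[(Long, String)], sizeMs: Long, slideMs: Long
  ): Map[(String, Long), Int] =
    records
      .flatMap { case (ts, line) =>
        val key = line.split(",")(0)
        windowStarts(ts, sizeMs, slideMs).map(start => ((key, start), 1))
      }
      .groupMapReduce(_._1)(_._2)(_ + _)
}
```

With a 10-second size and a 5-second slide, every record lands in two windows, so summing the per-window counts would double count the input.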


Source: https://stackoverflow.com/questions/45606999/how-to-count-the-number-of-records-processed-by-apache-flink-in-a-given-time-win
