Question
After defining a time window in Flink as follows:
val lines = socket.timeWindowAll(Time.seconds(5))
How can I compute the number of records in that particular window of 5 seconds?
Answer 1:
The most efficient way to perform a count aggregation is a ReduceFunction. However, reduce has the restriction that the input and output types must be identical, so you have to convert the input to an Int before applying the window:
val socket: DataStream[String] = ???
val cnts: DataStream[Int] = socket
  .map(_ => 1)                    // convert each record to 1
  .timeWindowAll(Time.seconds(5)) // group into 5-second windows
  .reduce((x, y) => x + y)        // sum the 1s to get the count
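If the type restriction of reduce is inconvenient, the same count can also be sketched with an AggregateFunction, which allows the input, accumulator, and output types to differ, so no prior map to Int is needed. This is only a minimal sketch, assuming Flink 1.4 or later (where add returns the accumulator) and a Scala API version that still offers timeWindowAll. The names CountAggregate and WindowCount, and the socket host and port, are illustrative and not part of the original answer.

```scala
import org.apache.flink.api.common.functions.AggregateFunction
import org.apache.flink.streaming.api.scala._
import org.apache.flink.streaming.api.windowing.time.Time

// Hypothetical helper: counts elements of any type T.
// Accumulator and result are both Long.
class CountAggregate[T] extends AggregateFunction[T, Long, Long] {
  override def createAccumulator(): Long = 0L            // empty window starts at 0
  override def add(value: T, acc: Long): Long = acc + 1  // one more record seen
  override def getResult(acc: Long): Long = acc          // emitted when the window fires
  override def merge(a: Long, b: Long): Long = a + b     // combine partial counts
}

object WindowCount {
  def main(args: Array[String]): Unit = {
    val env = StreamExecutionEnvironment.getExecutionEnvironment
    // Assumed source: a text socket on localhost:9999
    val socket: DataStream[String] = env.socketTextStream("localhost", 9999)

    val counts: DataStream[Long] = socket
      .timeWindowAll(Time.seconds(5))          // 5-second windows over the whole stream
      .aggregate(new CountAggregate[String])   // incremental count per window

    counts.print()
    env.execute("Window count sketch")
  }
}
```

Like reduce, aggregate is applied incrementally as records arrive, so the window only needs to keep the running count rather than all of its records.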
Answer 2:
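You could try this; it may solve your problem. It counts records per key (the first comma-separated field of each line) over a 10-second window that slides every 5 seconds.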
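// senv is assumed to be the StreamExecutionEnvironment predefined in Flink's Scala shell
val text = senv.socketTextStream("localhost", 9999)
val counts = text
  .map { (m: String) => (m.split(",")(0), 1) }   // (key, 1) per record
  .keyBy(0)                                      // group by the key field
  .timeWindow(Time.seconds(10), Time.seconds(5)) // 10-second window, sliding every 5 seconds
  .sum(1)                                        // sum the 1s per key
counts.print()
senv.execute("ProcessingTime processing example")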
Source: https://stackoverflow.com/questions/45606999/how-to-count-the-number-of-records-processed-by-apache-flink-in-a-given-time-win