Question
I have a Flink Table with the following columns: final String[] hNames = {"mID", "dateTime", "mValue", "unixDateTime", "mType"};
I want to create a DataStream in Apache Flink that builds tumbling windows of 2 elements each and calculates the average mValue for each window. Below I've used the SUM function, since there doesn't seem to be an AVG function. These windows must be grouped on the mID (an Integer) or dateTime column. I key the windows by the mType column, since each mType value represents a specific group of data.
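For reference, here is a rough sketch of what I mean by averaging without a built-in AVG: a custom AggregateFunction that keeps a (sum, count) pair, applied to the Tuple5 stream dsTuple that is built in the code further down. This is only an illustration of the idea, not code I know to be the right answer.

// Needs org.apache.flink.api.common.functions.AggregateFunction and org.apache.flink.api.java.tuple.Tuple2
// Sketch: average mValue (field f2) per mType key over count windows of 2 elements
DataStream<Double> avgPerWindow = dsTuple
        .keyBy(4)
        .countWindow(2)
        .aggregate(new AggregateFunction<Tuple5<Integer, Timestamp, Double, Long, String>, Tuple2<Double, Long>, Double>() {
            @Override
            public Tuple2<Double, Long> createAccumulator() {
                return Tuple2.of(0.0, 0L);                       // (sum, count)
            }
            @Override
            public Tuple2<Double, Long> add(Tuple5<Integer, Timestamp, Double, Long, String> value, Tuple2<Double, Long> acc) {
                return Tuple2.of(acc.f0 + value.f2, acc.f1 + 1); // accumulate mValue
            }
            @Override
            public Double getResult(Tuple2<Double, Long> acc) {
                return acc.f0 / acc.f1;                          // sum / count = average
            }
            @Override
            public Tuple2<Double, Long> merge(Tuple2<Double, Long> a, Tuple2<Double, Long> b) {
                return Tuple2.of(a.f0 + b.f0, a.f1 + b.f1);
            }
        });
avgPerWindow.print();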
Another issue is that the data in this app comes from a CSV file, so it is not real-time data. The problem is that Flink orders this data randomly; I want it sorted ascending on the mID or dateTime column.
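I assume the reordering comes from the rows being spread over parallel subtasks; a sketch of what forcing a single subtask (so the rows keep the file order) would look like:

// Sketch: with one parallel subtask the records keep the order in which the CSV source emits them
fsEnv.setParallelism(1);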
The code I have below does not print anything. What am I doing wrong here? The weird thing is that when I replace the countWindow() call with countWindowAll(), I do get output.
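To make the comparison concrete, here is a minimal sketch of both variants on the dsTuple stream from the code below. As far as I understand, countWindow(2) on a keyed stream only fires once a particular key has collected 2 elements, while countWindowAll(2) fires after any 2 elements and runs as a single, non-parallel operator.

// Keyed: a window fires only after a given mType value has received 2 elements
dsTuple.keyBy(4).countWindow(2).sum(3).print("keyed");

// Non-keyed: a window fires after any 2 elements, regardless of mType
dsTuple.countWindowAll(2).sum(3).print("all");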
final String[] hColumnNames = {"mID", "dateTime", "mValue", "unixDateTime", "mType"};

// Set up the streaming environment and the table environment on top of it
StreamExecutionEnvironment fsEnv = StreamExecutionEnvironment.getExecutionEnvironment();
StreamTableEnvironment tableEnv = StreamTableEnvironment.create(fsEnv);

// Define the CSV source with the schema of the five columns
TableSource csvSource = CsvTableSource.builder()
        .path("path")
        .fieldDelimiter(";")
        .field(hColumnNames[0], Types.INT())
        .field(hColumnNames[1], Types.SQL_TIMESTAMP())
        .field(hColumnNames[2], Types.DOUBLE())
        .field(hColumnNames[3], Types.LONG())
        .field(hColumnNames[4], Types.STRING())
        .build();

// Register the TableSource
tableEnv.registerTableSource("H", csvSource);
Table HTable = tableEnv.scan("H");
tableEnv.registerTable("HTable", HTable);

// Convert the table to a DataStream, both as Row and as Tuple5
DataStream<Row> stream = tableEnv.toAppendStream(HTable, Row.class);

TupleTypeInfo<Tuple5<Integer, Timestamp, Double, Long, String>> tupleType = new TupleTypeInfo<>(
        Types.INT(),
        Types.SQL_TIMESTAMP(),
        Types.DOUBLE(),
        Types.LONG(),
        Types.STRING());

DataStream<Tuple5<Integer, Timestamp, Double, Long, String>> dsTuple =
        tableEnv.toAppendStream(HTable, tupleType);

// What is going wrong below???
DataStream<Tuple5<Integer, Timestamp, Double, Long, String>> dsTuple1 = dsTuple
        .keyBy(4)        // key by mType
        .countWindow(2)  // count window of 2 elements per key
        .sum(3);

try {
    fsEnv.execute();
} catch (Exception e) {
    e.printStackTrace();
}
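For completeness, these are the imports the snippet above relies on, assuming a Flink 1.9-style setup with the legacy Table API; package names may differ between versions.

import java.sql.Timestamp;

import org.apache.flink.api.java.tuple.Tuple5;
import org.apache.flink.api.java.typeutils.TupleTypeInfo;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.table.api.Table;
import org.apache.flink.table.api.Types;
import org.apache.flink.table.api.java.StreamTableEnvironment;
import org.apache.flink.table.sources.CsvTableSource;
import org.apache.flink.table.sources.TableSource;
import org.apache.flink.types.Row;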
Source: https://stackoverflow.com/questions/58487582/convert-apache-flink-datastream-to-a-datastream-that-makes-tumbling-windows-of-2