Question
I want to stream a CSV file and perform SQL operations on it using Flink, but the code I have written reads the file once and then stops. It does not stream. Thanks in advance.
StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
StreamTableEnvironment tableEnv = StreamTableEnvironment.getTableEnvironment(env);
CsvTableSource csvtable = CsvTableSource.builder()
.path("D:/employee.csv")
.ignoreFirstLine()
.fieldDelimiter(",")
.field("id", Types.INT())
.field("name", Types.STRING())
.field("designation", Types.STRING())
.field("age", Types.INT())
.field("location", Types.STRING())
.build();
tableEnv.registerTableSource("employee", csvtable);
Table table = tableEnv.scan("employee").where("name='jay'").select("id,name,location");
//Table table1 = tableEnv.scan("employee").where("age > 23").select("id,name,age,location");
DataStream<Row> stream = tableEnv.toAppendStream(table, Row.class);
//DataStream<Row> stream1 = tableEnv.toAppendStream(table1, Row.class);
stream.print();
//stream1.print();
env.execute();
Answer 1:
The CsvTableSource is based on a FileInputFormat, which reads and parses the referenced file line by line. The resulting rows are forwarded into the streaming query. So the CsvTableSource is streaming in the sense that rows are continuously read and forwarded. However, the CsvTableSource terminates at the end of the file. Hence, it emits a bounded stream.
I assume the behavior you expect is that the CsvTableSource reads the file until its end and then waits for appending writes to the file. However, this is not how the CsvTableSource works. You would need to implement a custom TableSource for that.
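As a rough illustration of the "read to the end, then wait for appended lines" behavior described above, here is a minimal plain-Java sketch of the file-tailing part only. CsvTailer and poll() are hypothetical names, not part of Flink, and the sketch assumes the file is only ever appended to and is UTF-8 encoded. Wiring this into a custom TableSource (for example by calling poll() in a loop from a source's run method, sleeping between calls, and emitting each parsed row) is left out and would depend on the Flink version in use.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical helper (not a Flink class): remembers a byte offset into a
 * file and, on each poll, returns only the complete lines appended since
 * the previous poll. Assumes the file is append-only and UTF-8 encoded.
 */
final class CsvTailer {
    private final Path file;
    private long offset = 0L; // byte position just after the last complete line consumed

    CsvTailer(Path file) {
        this.file = file;
    }

    /** Returns the complete lines appended since the last call; empty if there are none. */
    List<String> poll() throws IOException {
        List<String> lines = new ArrayList<>();
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "r")) {
            long length = raf.length();
            if (length <= offset) {
                return lines; // nothing new yet
            }
            raf.seek(offset);
            byte[] buf = new byte[(int) (length - offset)];
            raf.readFully(buf);

            int start = 0;
            for (int i = 0; i < buf.length; i++) {
                if (buf[i] == '\n') {
                    String line = new String(buf, start, i - start, StandardCharsets.UTF_8);
                    if (line.endsWith("\r")) {
                        line = line.substring(0, line.length() - 1); // tolerate CRLF line endings
                    }
                    lines.add(line);
                    start = i + 1;
                }
            }
            // Advance only past complete lines; a partially written last line
            // stays unread and is picked up by a later poll.
            offset += start;
        }
        return lines;
    }
}
```

Each returned line would still need to be split on the delimiter and converted to the field types declared in the question (id, name, designation, age, location) before being emitted as a row. Unlike the CsvTableSource, a source built around such a loop never reaches an end of input, so the resulting stream is unbounded and the job keeps running until it is cancelled.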
Source: https://stackoverflow.com/questions/45295749/flink-csvtablesource-streaming