Flink CsvTableSource Streaming

Submitted by 允我心安 on 2019-12-07 09:21:18

Question


I want to stream a CSV file and perform SQL operations on it using Flink, but the code I have written reads the file only once and then stops. It does not stream. Thanks in advance.

StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
StreamTableEnvironment tableEnv = StreamTableEnvironment.getTableEnvironment(env);

CsvTableSource csvtable = CsvTableSource.builder()
    .path("D:/employee.csv")
    .ignoreFirstLine()
    .fieldDelimiter(",")
    .field("id", Types.INT())
    .field("name", Types.STRING())
    .field("designation", Types.STRING())
    .field("age", Types.INT())
    .field("location", Types.STRING())
    .build();

tableEnv.registerTableSource("employee", csvtable);

Table table = tableEnv.scan("employee").where("name='jay'").select("id,name,location");
//Table table1 = tableEnv.scan("employee").where("age > 23").select("id,name,age,location");

DataStream<Row> stream = tableEnv.toAppendStream(table, Row.class);
//DataStream<Row> stream1 = tableEnv.toAppendStream(table1, Row.class);

stream.print();
//stream1.print();

env.execute();

Answer 1:


The CsvTableSource is based on a FileInputFormat, which reads and parses the referenced file line by line. The resulting rows are forwarded into the streaming query. So the CsvTableSource is streaming in the sense that rows are continuously read and forwarded. However, the CsvTableSource terminates at the end of the file. Hence, it emits a bounded stream.

I assume the behavior you expect is that the CsvTableSource reads the file until its end and then waits for rows to be appended to the file. However, this is not how the CsvTableSource works; you would need to implement a custom TableSource for that.
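The core of such a custom source is tail-reading logic that remembers its read offset and polls the file for newly appended lines instead of stopping at EOF. Below is a minimal sketch of that logic in plain Java, with no Flink dependencies; the class name `CsvTailer` is hypothetical, and it assumes the file is append-only (never truncated or rewritten):

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Sketch: unlike FileInputFormat, which finishes at EOF, a tailing reader
// remembers where it stopped and picks up newly appended lines on the next poll.
public class CsvTailer {
    private long offset = 0;

    /** Returns the lines appended since the last call, without re-emitting old ones. */
    public List<String> pollNewLines(Path file) throws IOException {
        List<String> lines = new ArrayList<>();
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "r")) {
            raf.seek(offset);
            String line;
            while ((line = raf.readLine()) != null) {
                // RandomAccessFile.readLine decodes bytes as Latin-1; re-decode as UTF-8.
                lines.add(new String(line.getBytes(StandardCharsets.ISO_8859_1),
                                     StandardCharsets.UTF_8));
            }
            offset = raf.getFilePointer(); // remember position for the next poll
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("employee", ".csv");
        CsvTailer tailer = new CsvTailer();

        Files.writeString(file, "1,jay,engineer,25,NY\n");
        System.out.println(tailer.pollNewLines(file)); // first batch of lines

        Files.writeString(file, "2,ann,manager,30,LA\n",
                          java.nio.file.StandardOpenOption.APPEND);
        System.out.println(tailer.pollNewLines(file)); // only the appended line
    }
}
```

A real unbounded source would wrap a loop around `pollNewLines` inside a `SourceFunction` (sleeping between polls), parse each CSV line into a `Row`, and emit it downstream, so the query keeps running after the initial file contents have been read.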



Source: https://stackoverflow.com/questions/45295749/flink-csvtablesource-streaming
