ETL & Parsing CSV files in Cloud Dataflow

后悔当初 2021-02-06 17:49

I'm new to Cloud Dataflow and Java, so I'm hoping this is the right question to ask.

I have a csv file with n number of columns and rows that could be a string, intege

2 answers
  •  予麋鹿 (original poster)
    2021-02-06 18:24

    line.split(",");
    

    String.split gives wrong results when a field contains a quoted comma, as in this line:

    a,b,c,"we,have a string contains comma",d,e
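To make the failure concrete, here is a minimal sketch in plain Java. The class name `NaiveSplitDemo` and the method `naiveSplit` are made up for illustration. It runs the sample line through `String.split` and shows that the quoted field is cut in two.

```java
import java.util.Arrays;

// Hypothetical demo class: shows how String.split mishandles a quoted comma.
final class NaiveSplitDemo {

    // Splits on every comma and ignores CSV quoting rules entirely.
    static String[] naiveSplit(String line) {
        return line.split(",");
    }

    public static void main(String[] args) {
        String line = "a,b,c,\"we,have a string contains comma\",d,e";
        String[] parts = naiveSplit(line);
        // The line has 6 logical fields, but split returns 7 pieces
        // because the comma inside the quotes is treated as a separator.
        System.out.println(parts.length);
        System.out.println(Arrays.toString(parts));
    }
}
```

The fourth piece comes back as `"we` and the fifth as `have a string contains comma"`, so every column after the quoted field shifts by one.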

    A proper way to handle CSV data is to add a CSV library as a dependency:

            
    <dependency>
        <groupId>com.opencsv</groupId>
        <artifactId>opencsv</artifactId>
        <version>3.7</version>
    </dependency>
            
    

    and use the code below inside the ParDo:

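    // Needs: import com.opencsv.CSVParser; import java.io.IOException;
    public void processElement(ProcessContext c) throws IOException {
        String line = c.element();
        // CSVParser respects quotes, so a quoted comma stays inside its field
        CSVParser csvParser = new CSVParser();
        String[] parts = csvParser.parseLine(line);
        // parts holds one entry per CSV field of this line
    }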
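To illustrate what quote-aware parsing means, here is a small stand-alone sketch in plain Java. It is not opencsv's implementation. The class `QuoteAwareSplit` and its `parse` method are hypothetical, and the sketch assumes double quotes as the quote character, a doubled quote as an escaped quote, and no fields that span multiple lines.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; use a real CSV library in practice.
final class QuoteAwareSplit {

    // Splits one CSV line on commas that are outside double quotes.
    static List<String> parse(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char ch = line.charAt(i);
            if (ch == '"') {
                if (inQuotes && i + 1 < line.length() && line.charAt(i + 1) == '"') {
                    current.append('"'); // doubled quote inside quotes is a literal quote
                    i++;                 // skip the second quote of the pair
                } else {
                    inQuotes = !inQuotes; // opening or closing quote, not part of the value
                }
            } else if (ch == ',' && !inQuotes) {
                fields.add(current.toString()); // unquoted comma ends the current field
                current.setLength(0);
            } else {
                current.append(ch);
            }
        }
        fields.add(current.toString()); // last field has no trailing comma
        return fields;
    }

    public static void main(String[] args) {
        System.out.println(parse("a,b,c,\"we,have a string contains comma\",d,e"));
    }
}
```

On the sample line this returns six fields, with the fourth being `we,have a string contains comma` in one piece. A library such as opencsv handles more edge cases than this sketch, which is why the answer recommends using one.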
    
