Invalid char between encapsulated token and delimiter in Apache Commons CSV library

有刺的猬 2021-01-03 20:13

I am getting the following error while parsing the CSV file using the Apache Commons CSV library.

Exception in thread "main" java.io.IOException: (line 2) invalid char between encapsulated token and delimiter


        
5 Answers
  • 2021-01-03 20:30

    We ran into this issue when our data contained embedded quotes:

    0,"020"1,"BS:5252525  ORDER:99999"4
    

    The fix we applied was to disable quote handling: CSVFormat csvFileFormat = CSVFormat.DEFAULT.withQuote(null);

    A tip from @Cuga helped us resolve this. Thanks, @Cuga.

    The full code is:

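        import java.io.FileReader;
        import java.io.IOException;
        import java.util.List;
        import org.apache.commons.csv.CSVFormat;
        import org.apache.commons.csv.CSVParser;
        import org.apache.commons.csv.CSVRecord;

        // The class name is illustrative only.
        public class CsvNoQuoteExample {
            public static void main(String[] args) throws IOException {
                // Passing null disables quote handling, so embedded quotes stay literal characters.
                CSVFormat csvFileFormat = CSVFormat.DEFAULT.withQuote(null);
                String fileName = "test.csv";

                // try-with-resources closes the parser and its reader even if parsing fails.
                try (CSVParser csvFileParser = new CSVParser(new FileReader(fileName), csvFileFormat)) {
                    List<CSVRecord> csvRecords = csvFileParser.getRecords();
                    for (CSVRecord csvRecord : csvRecords) {
                        System.out.println(csvRecord);
                    }
                }
            }
        }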
    

    Result is

    CSVRecord [comment=null, mapping=null, recordNumber=1, values=[0, "020"1, "BS:5252525  ORDER:99999"4]]
    
  • 2021-01-03 20:34

    I found the solution to the problem. One of my CSV files has an attribute as follows: "attribute with nested "quote" "

    The parser fails because of the nested quotes in the attribute.

    To avoid the problem, escape each nested quote by doubling it: "attribute with nested ""quote"" "

    This is one way to solve the problem.
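
    The standard CSV convention (RFC 4180) writes each embedded quote as two quote characters inside a quoted field. Below is a minimal sketch of that rule for anyone generating the file themselves. It uses only the JDK, and the class and method names are hypothetical, not part of Commons CSV.

```java
final class CsvQuoteEscaper {
    // Hypothetical helper: wraps a value in double quotes and doubles every
    // embedded double quote, following the RFC 4180 convention.
    static String escapeField(String value) {
        return "\"" + value.replace("\"", "\"\"") + "\"";
    }

    public static void main(String[] args) {
        // prints: "attribute with nested ""quote"" "
        System.out.println(escapeField("attribute with nested \"quote\" "));
    }
}
```

    A field written this way should be read back by a quote-aware parser as the original text with single embedded quotes.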

  • 2021-01-03 20:43

    I ran into this issue when I forgot to call .withNullString("") on my CSVFormat. In my experience, this exception occurs when the CSVFormat does not match the file, for example when:

    • your quote symbol is wrong
    • your null string representation is wrong
    • your column separator char is wrong

    Make sure you know the details of your format. Also, some programs write a leading byte-order mark (for example, Excel uses \uFEFF) to denote the encoding of the file. This can also trip up your parser.
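
    If a byte-order mark is the suspect, one option is to drop it before the text reaches the parser. Below is a minimal JDK-only sketch that assumes the content has already been decoded into a String. The class and method names are hypothetical. For large files a stream-based approach, such as BOMInputStream from Apache Commons IO, is probably a better fit.

```java
final class BomStripper {
    // Hypothetical helper: removes a single leading U+FEFF character, if present.
    static String stripBom(String text) {
        return text.startsWith("\uFEFF") ? text.substring(1) : text;
    }

    public static void main(String[] args) {
        String raw = "\uFEFFid,name\n1,Ann";
        // The cleaned text no longer starts with the byte-order mark.
        System.out.println(stripBom(raw));
    }
}
```

    The cleaned text can then be wrapped in a java.io.StringReader and handed to the CSV parser as usual.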

  • 2021-01-03 20:50

    That line in the CSV file contains an invalid character between one of your cells and either the end of line, the end of file, or the next cell. A very common cause is a failure to escape your encapsulating character (the character used to "wrap" each cell, so the CSV parser knows where a cell (token) starts and ends).
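
    To locate the offending character in a single line, a rough diagnostic like the one below may help. It is a JDK-only sketch, not the Commons CSV implementation. The class and method names are hypothetical, and it assumes single-line records, fields that are quoted only when the quote is the first character, and doubled quotes as the escape. It is also stricter than the library, which tolerates whitespace after a closing quote.

```java
final class EncapsulationCheck {
    // Hypothetical diagnostic: returns the index of the first character that
    // follows a closing quote without being the delimiter, or -1 if none is found.
    static int firstInvalidIndex(String line, char delimiter, char quote) {
        int i = 0;
        while (i < line.length()) {
            if (line.charAt(i) == quote) {
                i++; // step past the opening quote
                while (i < line.length()) {
                    if (line.charAt(i) == quote) {
                        if (i + 1 < line.length() && line.charAt(i + 1) == quote) {
                            i += 2; // doubled quote is an escaped literal quote
                            continue;
                        }
                        i++; // step past the closing quote
                        break;
                    }
                    i++;
                }
                // After a closing quote, only the delimiter or end of line is valid.
                if (i < line.length() && line.charAt(i) != delimiter) {
                    return i;
                }
            } else {
                // Unquoted field: skip ahead to the next delimiter.
                while (i < line.length() && line.charAt(i) != delimiter) {
                    i++;
                }
            }
            i++; // step past the delimiter
        }
        return -1;
    }

    public static void main(String[] args) {
        // The '1' right after the closing quote of "020" is the invalid character.
        System.out.println(firstInvalidIndex("0,\"020\"1,\"BS:5252525  ORDER:99999\"4", ',', '"')); // prints 7
    }
}
```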

  • 2021-01-03 20:56

    We ran into this same error with data containing quotes in otherwise unquoted input, for example:

    some cell|this "cell" caused issues|other data
    

    It was hard to find, but Apache's docs mention the withQuote() method, which accepts null as a value to disable quote handling.

    We were getting the exact same error message and this (thankfully) ended up fixing the issue for us.
