Ruby unable to parse a CSV file: CSV::MalformedCSVError (Illegal quoting in line 1.)

被撕碎了的回忆 2021-02-03 22:17

Ubuntu 12.04 LTS

Ruby ruby 1.9.3dev (2011-09-23 revision 33323) [i686-linux]

Rails 3.2.9

Following is

9 answers
  • 2021-02-03 22:57

    Try this hint:

    1. Open your CSV file in a text editor
    2. Select the whole file and copy it
    3. Open a new text file
    4. Paste the CSV data into the new file and Save the new file
    5. Import your new CSV file
  • 2021-02-03 23:03

    I attempted to read the file into a string and then parse the string into a CSV table, but received an exception:

    CSV.parse(File.read('file.csv'), headers: true)
    CSV::MalformedCSVError: Unclosed quoted field on line 1794.
    

    None of the answers provided here worked for me. In fact, the one with the highest votes took so long to parse that I eventually terminated the execution. It was most likely raising many exceptions, which is costly on a large file.

    Worse, the error is not very helpful for a large CSV file: where exactly is line 1794? The file opened in LibreOffice without any problems, and line 1794 turned out to be the last row of data, so the problem apparently had to do with the end of the file. I inspected the contents as a string with File.read and noticed that the string ended with a carriage return character:

    ,\"\"\r
    

    I decided to use chomp to remove the carriage return at the end of the file. Note that if $/ has not been changed from Ruby's default record separator, chomp removes one trailing line ending of any kind (\n, \r, or \r\n).

    CSV.parse(File.read('file.csv').chomp, headers: true)
     => #<CSV::Table mode:col_or_row row_count:1794>
    

    And it worked. The problem was the \r character at the end of the file.
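
A minimal sketch of the chomp step, using a small in-memory string instead of a real file. The sample data is hypothetical; it only imitates a file with \n line endings and a stray \r at the very end. Whether the unchomped string raises depends on the csv version, so the sketch only shows the cleanup and the parse that follows it.

```ruby
require "csv"

# Hypothetical data: LF line endings, plus a stray carriage return at the end.
raw = "id,name\n1,\"\"\r"

# With the default $/, chomp removes one trailing "\n", "\r", or "\r\n".
cleaned = raw.chomp

table = CSV.parse(cleaned, headers: true)
puts table.size        # number of data rows (the header row is not counted)
puts table[0]["id"]    # first data row, "id" column
```

For a real file, the same idea is File.read(path).chomp, as in the answer above.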

  • 2021-02-03 23:05

    I just had an issue like this and discovered that CSV does not accept spaces between the column separator (col_sep) and the quote character. I had:

    12,  "N",  12, "Pacific/Majuro"
    

    but once I removed the spaces with gsub

    .gsub(/,\s+"/, ',"')
    

    resulting in

    12,"N",  12,"Pacific/Majuro"
    

    everything went fine.
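
A small sketch of that cleanup applied to the sample line from this answer. The variable names are hypothetical. One caveat that is an assumption about your data, not something from the answer: the regex is not quote-aware, so it would also rewrite a comma, spaces, and a quote that happen to appear inside a quoted field.

```ruby
require "csv"

line = %q(12,  "N",  12, "Pacific/Majuro")

# The strict parser may reject whitespace between the separator and the quote.
begin
  CSV.parse_line(line)
rescue CSV::MalformedCSVError => e
  warn "strict parser rejected the line: #{e.message}"
end

# Drop whitespace that sits between a comma and an opening quote.
cleaned = line.gsub(/,\s+"/, ',"')

row = CSV.parse_line(cleaned)
p row
```

The spaces in front of the unquoted 12 are left alone, so that field keeps its leading whitespace; call strip on it if that matters.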

  • 2021-02-03 23:07

    Rails 6 version, ruby 2.4+

    CSV.foreach(file, liberal_parsing: true, headers: :first_row) do |row|
      # do whatever
    end
    

    https://ruby-doc.org/stdlib-2.4.0/libdoc/csv/rdoc/CSV.html

  • 2021-02-03 23:08

    Add the :liberal_parsing => true argument to CSV.read (available since Ruby 2.4); this should solve some of the issues with "illegal quoting".
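
A short sketch of what the option changes, using CSV.parse_line on an in-memory string rather than CSV.read on a file. The sample string is the one used in the Ruby CSV documentation for this option; everything else is illustrative.

```ruby
require "csv"

# A quote character appears in the middle of an unquoted field.
str = 'is,this "three, or four",fields'

begin
  CSV.parse_line(str)   # strict parsing is the default
rescue CSV::MalformedCSVError => e
  warn "strict parsing failed: #{e.message}"
end

# With liberal parsing, the stray quotes are kept as literal characters.
row = CSV.parse_line(str, liberal_parsing: true)
p row
```

Liberal parsing makes the parser accept the input, but it does not guess the intended field boundaries, so it is worth checking the resulting rows before trusting them.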

  • 2021-02-03 23:10

    Anand, thank you for the encoding suggestion. This solved the illegal quoting problem for me.

    Note: if you want the iterator to treat the first row as headers instead of data, add headers: :first_row, like so:

    CSV.foreach("test.csv", encoding: "bom|utf-8", headers: :first_row)
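
To see why the encoding option matters, here is a self-contained sketch that writes a small file starting with a UTF-8 byte order mark and reads it back. The file contents and variable names are hypothetical. Without "bom|utf-8", the BOM typically ends up glued to the first header name or trips the parser on the first line.

```ruby
require "csv"
require "tempfile"

rows = Tempfile.create(["bom_demo", ".csv"]) do |f|
  # Write raw bytes: a UTF-8 BOM (EF BB BF) followed by two CSV lines.
  f.binmode
  f.write("\xEF\xBB\xBFname,city\nAda,London\n".b)
  f.flush

  # "bom|utf-8" tells Ruby to skip a leading BOM and read the rest as UTF-8.
  CSV.foreach(f.path, encoding: "bom|utf-8", headers: :first_row).map(&:to_h)
end

p rows
```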
    