Ruby/Rails CSV parsing, invalid byte sequence in UTF-8

别跟我提以往 2020-12-24 01:02

I am trying to parse a CSV file generated from an Excel spreadsheet.

Here is my code:

require 'csv'
file = File.open("input_file")
csv = CSV.parse


        
7 Answers
  • 2020-12-24 01:21

    I had the same problem. At first I worked around it by opening the file in Google Sheets and downloading it as a CSV, which was the easiest solution.

    Then I came across this gem:

    https://github.com/singlebrook/utf8-cleaner

    Now I don't need to worry about this issue at all. Hope this helps!

  • 2020-12-24 01:22

    Specify the encoding with the encoding option:

    CSV.foreach(file.path, headers: true, encoding: 'iso-8859-1:utf-8') do |row|
      ...
    end
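
    To make the effect of the encoding option concrete, here is a minimal, self-contained sketch. The temporary file, its name, and its contents are made up for illustration; it assumes the source file really is ISO-8859-1, which is common for CSVs exported from Excel but is not guaranteed.

    ```ruby
    require 'csv'
    require 'tempfile'

    # Build a small ISO-8859-1 sample file so the example runs on its own.
    # The file name and contents are hypothetical.
    sample = Tempfile.new(['latin1_sample', '.csv'])
    sample.binmode
    sample.write("name,city\nJos\xE9,M\xE1laga\n".b) # 0xE9 and 0xE1 are Latin-1 bytes
    sample.close

    rows = []
    # 'iso-8859-1:utf-8' reads as "external:internal": the bytes on disk are
    # ISO-8859-1, and every field is transcoded to UTF-8 while reading.
    CSV.foreach(sample.path, headers: true, encoding: 'iso-8859-1:utf-8') do |row|
      rows << row.to_h
    end

    puts rows.inspect
    sample.unlink
    ```

    Reading the same file without the encoding option is what typically triggers the "invalid byte sequence in UTF-8" error, because the Latin-1 bytes are not valid UTF-8.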
    
  • 2020-12-24 01:33

    Add "r:ISO-8859-1" as the second argument: File.open("input_file", "r:ISO-8859-1")

  • 2020-12-24 01:35

    You need to tell Ruby that the file is in ISO-8859-1. Change your file open line to this:

    file = File.open("input_file", "r:ISO-8859-1")
    

    The second argument tells Ruby to open the file read-only and to treat its contents as ISO-8859-1.
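
    As a rough sketch of how this fits together with CSV parsing, the example below writes a small ISO-8859-1 file, opens it with "r:ISO-8859-1", and parses the result. The sample file and its contents are hypothetical, and the explicit conversion to UTF-8 is an assumption about what the rest of the program wants, not something the parser requires.

    ```ruby
    require 'csv'
    require 'tempfile'

    # Hypothetical sample data, written as raw ISO-8859-1 bytes.
    sample = Tempfile.new(['latin1_sample', '.csv'])
    sample.binmode
    sample.write("id,label\n1,caf\xE9\n".b) # 0xE9 is "e with acute" in Latin-1
    sample.close

    # "r" = read-only, "ISO-8859-1" = how to interpret the bytes on disk.
    text = File.open(sample.path, 'r:ISO-8859-1') { |f| f.read }

    # Optional step (an assumption): convert to UTF-8 so the parsed fields
    # can be mixed freely with other UTF-8 strings in the application.
    utf8_text = text.encode('UTF-8')

    table = CSV.parse(utf8_text, headers: true)
    puts table[0]['label']
    sample.unlink
    ```

    Using the block form of File.open also closes the file automatically once the read is finished.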

  • 2020-12-24 01:38

    If you only have one file (or a few), so that you do not need to detect the encoding automatically for arbitrary input, and you can view the file's contents as plain text (txt, csv, etc.) with a separator such as a semicolon, you can manually create a new file with a .csv extension, paste the contents into it, and then parse it as usual.

    Keep in mind that this is a workaround. But when you only need to parse one big Excel file on Linux, converted to some flavor of CSV, it saves the time you would otherwise spend experimenting with encodings.

  • 2020-12-24 01:45

    Save the file as UTF-8. If for some reason you need to save it in a different encoding, specify that encoding when reading the file.
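
    When you cannot control how the file was saved, one possible approach is to accept the content if it is already valid UTF-8 and otherwise reinterpret it in a known encoding. The sketch below is only an illustration: to_utf8 is a hypothetical helper, not part of Ruby or the CSV library, and the ISO-8859-1 fallback is an assumption that must match how the file was actually saved.

    ```ruby
    # Hypothetical helper: returns the given bytes as a valid UTF-8 string.
    # `fallback` is the encoding assumed when the bytes are not valid UTF-8.
    def to_utf8(raw, fallback: Encoding::ISO_8859_1)
      utf8 = raw.dup.force_encoding(Encoding::UTF_8)
      return utf8 if utf8.valid_encoding?

      # Not valid UTF-8: reinterpret the same bytes in the fallback
      # encoding, then transcode them to UTF-8.
      raw.dup.force_encoding(fallback).encode(Encoding::UTF_8)
    end

    already_utf8 = to_utf8("na\u00EFve".b) # bytes of a file saved as UTF-8
    from_latin1  = to_utf8("na\xEFve".b)   # bytes of a file saved as ISO-8859-1

    puts already_utf8 == from_latin1
    ```

    In practice the raw bytes would come from something like File.binread, and the returned string can then be handed to CSV.parse.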
