问题
I have found a CSV parsing issue with FasterCSV (1.5.0) which seems like a genuine bug, but which I'm hoping there's a workaround for.
Basically, adding a space after the separator (in my case a comma) when the fields are enclosed in quotes generates a MalformedCSVError
.
Here's a simple example:
# No quotes on fields -- works fine
FasterCSV.parse_line("one,two,three")
=> ["one", "two", "three"]
# Quotes around fields with no spaces after separators -- works fine
FasterCSV.parse_line("\"one\",\"two\",\"three\"")
=> ["one", "two", "three"]
# Quotes around fields but with a space after the first separator -- fails!
FasterCSV.parse_line("\"one\", \"two\",\"three\"")
=> FasterCSV::MalformedCSVError: Illegal quoting on line 1.
Am I going mad, or is this a bug in FasterCSV?
回答1:
The MalformedCSVError
is correct here.
Leading/trailing spaces in CSV format are not ignored, they are considered part of a field. So this means you have started a field with a space, and then included unescaped double quotes in that field, which would cause the illegal quoting error.
Maybe this library is just more strict than others you have used.
回答2:
Maybe you could set the :col_sep: option to ', ' to make it parse files like that.
回答3:
I had hoped that the :col_sep
option might allow a regular expression, but it seems to be used for both reading and writing, which is a shame. The documentation doesn't hold out much hope and your need is probably more immediate than could be satisfied by requesting a change or submitting a patch ;-)
If you're calling #parse_line
explicitly, then you could always call
gsub(/,\s*/, ',')
on your input line. That regular expression might need to change significantly if you anticipate the possibility of comma-space within quoted strings. (I'd suggest reposting such a question here with a suitable tag and let the RegEx mavens loose on it should that be the case).
来源:https://stackoverflow.com/questions/1807942/overcoming-a-basic-problem-with-csv-parsing-using-the-fastercsv-gem