Python CSV module - quotes go missing

一笑奈何 提交于 2019-12-21 17:22:06

问题


I have a CSV file that has data like this

15,"I",2,41301888,"BYRNESS RAW","","BYRNESS VILLAGE","NORTHUMBERLAND","ENG"
11,"I",3,41350101,2,2935,2,2008-01-09,1,8,0,2003-02-01,,2009-12-22,2003-02-11,377016.00,601912.00,377105.00,602354.00,10

I am reading this and then writing different rows to different CSV files.

However, in the original data there are quotes around the non-numeric fields, as some of them contain commas within the field.

I am not able to keep the quotes.

I have researched lots and discovered the quoting=csv.QUOTE_NONNUMERIC however this now results in a quote mark around every field and I dont know why??

If i try one of the other quoting options like MINIMAL I end up with an error message regarding the date value, 2008-01-09, not being a float.

I have tried to create a dialect, add the quoting on the csv reader and writer but nothing I have tried results in the getting an exact match to the original data.

Anyone had this same problem and found a solution.


回答1:


When writing, quoting=csv.QUOTE_NONNUMERIC keeps values unquoted as long as they're numbers, ie. if their type is int or float (for example), which means it will write what you expect.

Your problem could be that, when reading, a csv.reader will turn every row it reads into a list of strings (if you read the documentation carefully enough, you'll see a reader does not perform automatic data type conversion!

If you don't perform any kind of conversion after reading, then when you write you'll end up with everything on quotes... because everything you write is a string.

Edit: of course, date fields will be quoted, because they are not numbers, meaning you cannot get the exact expected behaviour using the standard csv.writer.




回答2:


Are you sure you have a problem? The behavior you're describing is correct: The csv module will enclose strings in quotes only if it's necessary for parsing them correctly. So you should expect to see quotes only around strings containing a comma, newlines, etc. Unless you're getting errors reading your output back in, there is no problem.




回答3:


Trying to get an "exact match" of the original data is a difficult and potentially fruitless endeavor. quoting=csv.QUOTE_NONNUMERIC put quotes around everything because every field was a string when you read it in.

Your concern that some of the "quoted" input fields could have commas is usually not that big a deal. If you added a comma to one of your quoted fields and used the default writer, the field with the comma would be automatically quoted in the output.



来源:https://stackoverflow.com/questions/9353792/python-csv-module-quotes-go-missing

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!