Question
Here is the content of a CSV file, 'test.csv', which I am trying to read with pandas read_csv():
"col1", "col2", "col3", "col4"
"v1", "v2", "v3", "v4"
"v21", "v22", "v23", "this, "creating, what to do? " problems"
This is the command I am using:
messages = pd.read_csv('test.csv', sep=',', skipinitialspace=True)
But I am getting the following error:
CParserError: Error tokenizing data. C error: Expected 4 fields in line 3, saw 5
I want the content of col4 in line 3 to be 'this, "creating, what to do? " problems'
How can I read the file when a column value can contain both the quote character and the delimiter?
Answer 1:
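pandas does not let you keep malformed rows as they are, and to be honest I don't see a way to make the parser ignore some " characters but not others in your example. I think your intuition of using '", "' as the delimiter and then cleaning up afterwards is the best approach. A separator longer than one character is interpreted as a regular expression and handled by the Python parsing engine, which does not give quote characters any special treatment when splitting. If you really want to do this in one line: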
message = pd.read_csv('test.csv', sep='", "', engine='python', names=['col1', 'col2', 'col3', 'col4'], skiprows=1).apply(lambda x: x.str.strip('"'))
which sidesteps the quotes in the header row by skipping it and passing the column names explicitly, and gives you:
>>> message
   col1 col2 col3                                     col4
0    v1   v2   v3                                       v4
1   v21  v22  v23  this, "creating, what to do? " problems
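The same split-then-clean-up idea can also be sketched without read_csv, which makes each step explicit. This is only a minimal illustration, not part of the original answer: the helper name split_quoted_line, the SEP constant and the inline SAMPLE data are hypothetical, and it assumes every line starts and ends with a double quote and uses exactly '", "' between fields. Unlike str.strip('"'), it removes only one outer quote from each end of the line, and it caps the number of splits so that any further '", "' would stay inside the last column.

```python
import io

SEP = '", "'  # the sequence that separates two quoted fields (hypothetical constant)

# Inline copy of the file from the question, so the sketch is self-contained.
SAMPLE = '''"col1", "col2", "col3", "col4"
"v1", "v2", "v3", "v4"
"v21", "v22", "v23", "this, "creating, what to do? " problems"
'''


def split_quoted_line(line, n_cols):
    """Split one line on '", "' after dropping the outermost quotes."""
    text = line.strip()
    # Remove exactly one leading and one trailing quote, if both are present.
    if len(text) >= 2 and text.startswith('"') and text.endswith('"'):
        text = text[1:-1]
    # At most n_cols - 1 splits: anything left over belongs to the last column.
    return text.split(SEP, n_cols - 1)


# Parse every non-empty line; the first parsed row holds the column names.
rows = [split_quoted_line(line, 4) for line in io.StringIO(SAMPLE) if line.strip()]
header, records = rows[0], rows[1:]

for record in records:
    print(record)
```

If a DataFrame is needed afterwards, the parsed rows can be passed on with pd.DataFrame(records, columns=header).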
Source: https://stackoverflow.com/questions/35686920/reading-csv-from-pandas-having-both-quotechar-and-delimiter-for-a-column-value