Question
I have a file that looks like:
'colA'|'colB'
'word"A'|'A'
'word'B'|'B'
I want to use pd.read_csv('input.csv', sep='|', quotechar="'"), but I get the following output:
colA colB
word"A A
wordB' B
The last row is not correct; it should be word'B B. How do I get around this? I have tried various iterations, but none of them reads both rows correctly. I need some CSV reading expertise!
Answer 1:
I think you need to read the file without setting quotechar and then remove the single quotes with str.strip and apply:
import pandas as pd
import io
temp=u"""'colA'|'colB'
'word"A'|'A'
'word'B'|'B'"""
# after testing, replace io.StringIO(temp) with the filename
df = pd.read_csv(io.StringIO(temp), sep='|')
df = df.apply(lambda x: x.str.strip("'"))
df.columns = df.columns.str.strip("'")
print(df)
     colA colB
0  word"A    A
1  word'B    B
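One caveat with str.strip("'") is that it removes every leading and trailing single quote, so a value that genuinely ends in a quote would lose it. A minimal sketch of a stricter variant follows, assuming the separator never occurs inside a field. The names SAMPLE and read_single_quoted are hypothetical and not part of pandas; the sketch turns quote handling off with csv.QUOTE_NONE and trims exactly one quote from each end.

```python
import csv
import io

import pandas as pd

# Same sample data as in the question.
SAMPLE = """'colA'|'colB'
'word"A'|'A'
'word'B'|'B'"""


def read_single_quoted(source, sep="|"):
    """Hypothetical helper: read without quote handling, then trim the quotes."""
    # QUOTE_NONE makes the parser treat ' and " as ordinary characters.
    # dtype=str keeps every column as text so the .str accessor is available.
    df = pd.read_csv(source, sep=sep, quoting=csv.QUOTE_NONE, dtype=str)
    # Remove at most one leading and one trailing single quote per cell.
    df = df.apply(lambda col: col.str.replace(r"^'|'$", "", regex=True))
    df.columns = df.columns.str.replace(r"^'|'$", "", regex=True)
    return df


df = read_single_quoted(io.StringIO(SAMPLE))
print(df)
```

Because every column is read as text here, numeric columns would need to be converted afterwards, for example with pd.to_numeric.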
Answer 2:
The source of the problem is that ' is used both as the quote character and as a regular character inside a field.
You can escape it in the file, e.g.:
'colA'|'colB'
'word"A'|'A'
'word/'B'|'B'
And then use escapechar:
>>> pd.read_csv('input.csv',sep='|',quotechar="'",escapechar="/")
     colA colB
0  word"A    A
1  word'B    B
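If editing the file by hand is not practical, the escaping can be done in memory before parsing. The following is only a sketch under the assumption that the separator never appears inside a field and that every field is wrapped in single quotes. The helper escape_inner_quotes and the variable RAW are hypothetical names, and "/" is used as the escape character only because the example above uses it.

```python
import io

import pandas as pd

# Same sample data as in the question.
RAW = """'colA'|'colB'
'word"A'|'A'
'word'B'|'B'"""


def escape_inner_quotes(text, sep="|", quote="'", esc="/"):
    """Hypothetical helper: escape quote characters that sit inside a field."""
    fixed_lines = []
    for line in text.splitlines():
        fields = []
        for field in line.split(sep):
            if len(field) >= 2 and field.startswith(quote) and field.endswith(quote):
                inner = field[1:-1]
                # Escape the escape character first, then the inner quotes.
                inner = inner.replace(esc, esc + esc).replace(quote, esc + quote)
                field = quote + inner + quote
            fields.append(field)
        fixed_lines.append(sep.join(fields))
    return "\n".join(fixed_lines)


escaped = escape_inner_quotes(RAW)
df = pd.read_csv(io.StringIO(escaped), sep="|", quotechar="'", escapechar="/")
print(df)
```

For a real file, the text would first be read with open(...).read() and then passed through the helper in the same way.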
You can also use quoting=csv.QUOTE_ALL without setting quotechar, but the output will then include the single-quote characters:
>>> import pandas as pd
>>> import csv
>>> pd.read_csv('input.csv',sep='|',quoting=csv.QUOTE_ALL)
     'colA' 'colB'
0  'word"A'    'A'
1  'word'B'    'B'
Source: https://stackoverflow.com/questions/37589795/pandas-read-csv-quotes-issue