Pandas read_csv quotes issue

Submitted by 北城余情 on 2019-12-19 07:51:12

Question


I have a file that looks like:

'colA'|'colB'
'word"A'|'A'
'word'B'|'B'

I want to read it with pd.read_csv('input.csv', sep='|', quotechar="'"), but I get the following output:

colA    colB
word"A   A
wordB'   B

The last row is not correct: it should be word'B and B. How do I get around this? I have tried various iterations, but none of them reads both rows correctly. I need some CSV reading expertise!


Answer 1:


I think you need str.strip with apply:

import pandas as pd
import io

temp=u"""'colA'|'colB'
'word"A'|'A'
'word'B'|'B'"""

# after testing, replace io.StringIO(temp) with the filename
df = pd.read_csv(io.StringIO(temp), sep='|')

# strip the surrounding single quotes from every value and from the header
df = df.apply(lambda x: x.str.strip("'"))
df.columns = df.columns.str.strip("'")
print(df)
     colA colB
0  word"A    A
1  word'B    B



Answer 2:


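The source of the problem is that ' is used both as the quote character and as a regular character inside a field, so the parser cannot tell the two roles apart.

You can escape the embedded quote, e.g.: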

'colA'|'colB'
'word"A'|'A'
'word/'B'|'B'

And then use escapechar:

>>> pd.read_csv('input.csv',sep='|',quotechar="'",escapechar="/")
     colA colB
0  word"A    A
1  word'B    B
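If the file cannot be edited by hand, the escaping step could be done in memory before parsing. The following is a minimal sketch, not part of the original answer. It assumes that "/" never appears in the data, that "|" never appears inside a field, and that an embedded quote is never directly next to a separator or a line end. The regular expression and the variable names are illustrative.

```python
import io
import re

import pandas as pd

raw = """'colA'|'colB'
'word"A'|'A'
'word'B'|'B'"""

# A quote is treated as "embedded" when it has a character on both sides
# that is neither the separator nor a line break. Quotes at field
# boundaries always touch "|", a line break, or the start/end of the text.
embedded_quote = re.compile(r"(?<=[^|\r\n])'(?=[^|\r\n])")

# Prefix every embedded quote with the escape character "/" (assumed unused).
escaped = embedded_quote.sub("/'", raw)

df = pd.read_csv(io.StringIO(escaped), sep="|", quotechar="'", escapechar="/")
print(df)
```

For a real file, read its text first (for example with open('input.csv').read()) in place of the raw string. If "/" can occur in the data, pick a different escape character that does not.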

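You can also use quoting=csv.QUOTE_ALL, but the output will then include the quote characters, because quotechar is left at its default of " and the single quotes are read as ordinary data: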

>>> import pandas as pd
>>> import csv
>>> pd.read_csv('input.csv',sep='|',quoting=csv.QUOTE_ALL)
     'colA' 'colB'
0  'word"A'    'A'
1  'word'B'    'B'
>>>
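A related variation, offered here as a sketch and not taken from the original answers, is to turn quote handling off entirely with quoting=csv.QUOTE_NONE and then remove exactly one quote from each end of every value. This assumes "|" never occurs inside a field. The helper strip_one_quote is a hypothetical name introduced for illustration.

```python
import csv
import io

import pandas as pd

raw = """'colA'|'colB'
'word"A'|'A'
'word'B'|'B'"""

# QUOTE_NONE makes the parser treat both ' and " as ordinary characters,
# so fields are split on "|" only. dtype=str keeps every value as text.
df = pd.read_csv(io.StringIO(raw), sep="|", quoting=csv.QUOTE_NONE, dtype=str)


def strip_one_quote(value):
    """Remove one leading and one trailing single quote, if both are present."""
    if isinstance(value, str) and len(value) >= 2 and value[0] == "'" and value[-1] == "'":
        return value[1:-1]
    return value  # leave missing values and unquoted text unchanged


df = df.apply(lambda col: col.map(strip_one_quote))
df.columns = [strip_one_quote(name) for name in df.columns]
print(df)
```

Unlike str.strip("'"), which removes every leading and trailing quote, this removes only the outer pair, so a value that legitimately begins or ends with a quote keeps it.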


Source: https://stackoverflow.com/questions/37589795/pandas-read-csv-quotes-issue
