Python Pandas Error tokenizing data

I'm trying to use pandas to manipulate a .csv file, but I get this error:

pandas.parser.CParserError: Error tokenizing data. C error: Expected 2 field

30 answers

小鲜肉

2020-11-22 05:39
The dataset I used contained many quotation marks (") that were not part of the CSV quoting. I fixed the error by passing this parameter to read_csv():
```
quoting=3  # 3 is the value of csv.QUOTE_NONE
```
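As a minimal sketch of what that parameter does (the sample data below is made up for illustration, not taken from my dataset), a field that opens with a stray quote is kept as literal text once quoting is disabled:

```python
import csv
import io

import pandas as pd

# Made-up sample: the second field on the first data row opens with a
# quote character that is never closed.
raw = 'id,comment\n1,"unfinished\n2,fine\n'

# csv.QUOTE_NONE (the integer 3) tells the parser to treat quote
# characters as ordinary text rather than as field wrappers.
df = pd.read_csv(io.StringIO(raw), quoting=csv.QUOTE_NONE)
comments = df["comment"].tolist()
```

Note that the quote character stays in the data, so it may need to be stripped afterwards.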
再見小時候

2020-11-22 05:39
I had a similar case, and reading the file with these settings worked:
```
train = pd.read_csv('input.csv', encoding='latin1', engine='python')
```
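A small self-contained sketch of the same call, assuming the file really is Latin-1 encoded (the temporary file and its contents are hypothetical, written only so the example runs on its own):

```python
import os
import tempfile

import pandas as pd

# Hypothetical sample file, written as Latin-1 bytes for illustration.
text = "name,city\nJosé,Málaga\n"
with tempfile.NamedTemporaryFile("wb", suffix=".csv", delete=False) as fh:
    fh.write(text.encode("latin1"))
    path = fh.name

try:
    # Passing the file's actual encoding lets pandas decode bytes such as
    # 0xE9, which are not valid UTF-8 on their own.
    train = pd.read_csv(path, encoding="latin1", engine="python")
finally:
    os.remove(path)

names = train["name"].tolist()
```

If the file uses a different encoding, the accented characters will load but come out wrong, so it is worth checking a few values after reading.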

隐瞒了意图╮

2020-11-22 05:40

For those who hit a similar issue with Python 3 on Linux:

pandas.errors.ParserError: Error tokenizing data. C error: Calling
read(nbytes) on source failed. Try engine='python'.

Try:

df = pd.read_csv('file.csv', encoding='utf8', engine='python')


既然无缘

2020-11-22 05:40
I had a dataset with pre-existing row numbers, so I used index_col:
```
pd.read_csv('train.csv', index_col=0)
```
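For illustration, here is a sketch with made-up data in which the first column holds the saved row numbers and has an empty header, as happens when a DataFrame is written out with its index:

```python
import io

import pandas as pd

# Made-up sample: the unnamed first column contains the old row numbers.
raw = ",a,b\n0,10,20\n1,30,40\n"

# index_col=0 uses that first column as the index instead of as data.
df = pd.read_csv(io.StringIO(raw), index_col=0)
columns = list(df.columns)
```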
夕颜

2020-11-22 05:41

The following sequence of commands works (I lose the first line of the data because header=None is not passed, but at least it loads):

df = pd.read_csv(filename, usecols=range(0, 42))
df.columns = ['YR', 'MO', 'DAY', 'HR', 'MIN', 'SEC', 'HUND', 'ERROR', 'RECTYPE', 'LANE', 'SPEED', 'CLASS', 'LENGTH', 'GVW', 'ESAL', 'W1', 'S1', 'W2', 'S2', 'W3', 'S3', 'W4', 'S4', 'W5', 'S5', 'W6', 'S6', 'W7', 'S7', 'W8', 'S8', 'W9', 'S9', 'W10', 'S10', 'W11', 'S11', 'W12', 'S12', 'W13', 'S13', 'W14']

Following does NOT work:

df = pd.read_csv(filename, names=['YR', 'MO', 'DAY', 'HR', 'MIN', 'SEC', 'HUND', 'ERROR', 'RECTYPE', 'LANE', 'SPEED', 'CLASS', 'LENGTH', 'GVW', 'ESAL', 'W1', 'S1', 'W2', 'S2', 'W3', 'S3', 'W4', 'S4', 'W5', 'S5', 'W6', 'S6', 'W7', 'S7', 'W8', 'S8', 'W9', 'S9', 'W10', 'S10', 'W11', 'S11', 'W12', 'S12', 'W13', 'S13', 'W14'], usecols=range(0, 42))

CParserError: Error tokenizing data. C error: Expected 53 fields in line 1605634, saw 54

Following does NOT work:

df = pd.read_csv(filename, header=None)

CParserError: Error tokenizing data. C error: Expected 53 fields in line 1605634, saw 54

Hence, for your problem, you have to pass usecols=range(0, 2).
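Before choosing usecols, it can help to see which lines actually have a different number of fields. The following is only a diagnostic sketch using the standard csv module; find_ragged_lines is a hypothetical helper name, and it assumes the most common field count is the "expected" one:

```python
import csv
import io
from collections import Counter


def find_ragged_lines(handle, delimiter=","):
    """Return (expected_width, [(line_number, width), ...]) for odd rows.

    Hypothetical helper: the expected width is assumed to be the most
    common field count found in the file.
    """
    reader = csv.reader(handle, delimiter=delimiter)
    # reader.line_num is the physical line on which each row ended.
    widths = [(reader.line_num, len(row)) for row in reader]
    if not widths:
        return 0, []
    expected = Counter(w for _, w in widths).most_common(1)[0][0]
    return expected, [(n, w) for n, w in widths if w != expected]


# Made-up sample: line 3 has one field too many.
sample = "a,b\n1,2\n3,4,5\n6,7\n"
expected, bad = find_ragged_lines(io.StringIO(sample))
```

For a real file, pass an open file object (opened with newline='') instead of the StringIO sample.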


再見小時候

2020-11-22 05:41

This is what I did.

sep='::' solved my issue:

data=pd.read_csv('C:\\Users\\HP\\Downloads\\NPL ASSINGMENT 2 imdb_labelled\\imdb_labelled.txt',engine='python',header=None,sep='::')
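A self-contained sketch of the same idea, using made-up rows rather than my actual file. A separator longer than one character needs engine='python', because it is interpreted as a regular expression:

```python
import io

import pandas as pd

# Made-up sample rows whose fields are separated by a double colon.
raw = "1::Toy Story::1995\n2::Jumanji::1995\n"

# header=None because the sample has no header row; columns get
# integer labels 0, 1, 2.
data = pd.read_csv(io.StringIO(raw), engine="python", header=None, sep="::")
first_title = data.iloc[0, 1]
```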
