pandas.errors.ParserError: ',' expected after '"'


逝去的感伤

I am trying to read this dataset from Kaggle: Amazon sales rank data for print and kindle books

The file amazon_com_extras.csv has a column named "Title".


2 Answers

隐瞒了意图╮

2021-01-22 03:57

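Detecting the dialect with `csv.Sniffer` works for me:

```
import csv

import pandas as pd

# Let the standard library guess the delimiter and quoting from a sample of the file.
with open('spotify_dataset.csv', newline='') as csvfile:
    dialect = csv.Sniffer().sniff(csvfile.read(14734))

# on_bad_lines='warn' (pandas >= 1.3) replaces the removed error_bad_lines=False.
df = pd.read_csv('spotify_dataset.csv', engine='python', dialect=dialect, on_bad_lines='warn')
```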
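To make it clearer what the sniffer actually returns, here is a small sketch on an in-memory sample. The sample text and variable names are made up for illustration and are not taken from either dataset.

```python
import csv

# Hypothetical sample: semicolon-delimited, every field quoted.
sample = '"id";"title";"year"\n"1";"Dune";"1965"\n"2";"Emma";"1815"\n'

# sniff() returns a Dialect subclass describing the guessed format.
dialect = csv.Sniffer().sniff(sample)
print(repr(dialect.delimiter), repr(dialect.quotechar))

# The guessed dialect can then drive a regular csv.reader.
rows = list(csv.reader(sample.splitlines(), dialect=dialect))
print(rows[1])
```

The sniffer only guesses the delimiter and quoting style from the sample, so it may help when the dialect is unusual, but it cannot repair quotes that are left unescaped inside a field.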


感情败类

2021-01-22 04:10
This happens because some fields in the file contain unescaped double quotes inside text that is itself quoted.

I am not aware of a way to instruct the csv parser to handle that without preprocessing.

If you don't care about the affected rows, you can use
```
pd.read_csv("amazon_com_extras.csv", engine="python", sep=",", quotechar='"', on_bad_lines="warn")
```
That suppresses the exception but drops the affected lines; each skipped line is reported as a warning in the console. (`on_bad_lines` requires pandas 1.3 or later; older versions use `error_bad_lines=False` instead.)

An example of such a line:
```
"1405246510","book","hardcover",""Hannah Montana" Annual 2010","Unknown","Egmont Books Ltd"
```
Notice the unescaped quotes around Hannah Montana inside the already quoted title field.
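To see why this line trips the parser, here is a minimal sketch using the standard library's `csv` module, which pandas' python engine builds on. Passing `strict=True` is my assumption for reproducing the failure outside pandas, and the variable names are illustrative.

```python
import csv
import io

# The problematic line quoted above, with a trailing newline.
bad = '"1405246510","book","hardcover",""Hannah Montana" Annual 2010","Unknown","Egmont Books Ltd"\n'

try:
    # In strict mode the reader refuses a stray character after a closing quote.
    list(csv.reader(io.StringIO(bad), strict=True))
    error_message = None
except csv.Error as exc:
    error_message = str(exc)

print(error_message)
```

The message printed should resemble the one in the question title: after the `""` that opens the fourth field, the reader expects a delimiter and finds the letter `H` instead.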

By contrast, a standard CSV dialect escapes each embedded quote by doubling it:
```
1405246510,"book","hardcover","""Hannah Montana"" Annual 2010","Unknown","Egmont Books Ltd"
```
You can, for example, open the file in LibreOffice and save it as CSV again to get a well-formed dialect, or use another preprocessing technique.
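As one possible preprocessing step, the sketch below doubles the stray quotes line by line. `repair_quotes` is a hypothetical helper, not part of pandas or the `csv` module. It assumes that every field is quoted, that fields are separated by the exact sequence `","`, that no field contains that sequence or a line break, and that the file has no correctly doubled quotes already (those would be doubled again).

```python
import csv
import io


def repair_quotes(line: str) -> str:
    """Double the quotes found inside fully quoted fields (hypothetical helper)."""
    body = line.rstrip("\r\n")
    # Only touch lines that look fully quoted; return anything else unchanged.
    if len(body) < 2 or not (body.startswith('"') and body.endswith('"')):
        return body
    # Assumption: the exact sequence `","` only ever occurs between two fields.
    fields = body[1:-1].split('","')
    return ",".join('"' + field.replace('"', '""') + '"' for field in fields)


bad = '"1405246510","book","hardcover",""Hannah Montana" Annual 2010","Unknown","Egmont Books Ltd"'
fixed = repair_quotes(bad)

# The repaired line now parses cleanly, even in strict mode.
row = next(csv.reader(io.StringIO(fixed), strict=True))
print(fixed)
print(row[3])
```

For the whole file, you could run every line through `repair_quotes`, join the results with newlines, and pass them to `pd.read_csv` wrapped in `io.StringIO`.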