I'm processing lots (thousands) of ~100k-line CSV files that are produced by someone else. 9 times out of 10 the files have 8 columns and all is right with the world. The 10th
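If you want to drop the bad lines, you might be able to use error_bad_lines=False (and warn_bad_lines=False if you want it to be quiet about it). Note that both options were deprecated in pandas 1.3 and removed in 2.0, where the on_bad_lines parameter replaces them: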
>>> !cat unclean.csv
A,B,C,D,E,F,G,H
A,B,C,D,E,F,G,H
A,B,C,D,E,F,Foo,Bar,G,H
A,B,C,D,E,F,G,H
A,B,C,D,E,F,Foo,Bar,G,H
A,B,C,D,E,F,G,H
A,B,C,D,E,F,G,H
>>> df = pd.read_csv("unclean.csv", error_bad_lines=False, header=None)
Skipping line 3: expected 8 fields, saw 10
Skipping line 5: expected 8 fields, saw 10
>>> df
   0  1  2  3  4  5  6  7
0  A  B  C  D  E  F  G  H
1  A  B  C  D  E  F  G  H
2  A  B  C  D  E  F  G  H
3  A  B  C  D  E  F  G  H
4  A  B  C  D  E  F  G  H
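
On newer pandas the same thing should be possible with on_bad_lines, which took over from error_bad_lines and warn_bad_lines. This is only a minimal sketch, assuming pandas 1.3 or later; the sample data is inlined through io.StringIO (my addition, so the snippet runs without unclean.csv on disk):

```python
import io

import pandas as pd

# Same rows as unclean.csv above, inlined so the snippet is self-contained.
raw = """A,B,C,D,E,F,G,H
A,B,C,D,E,F,G,H
A,B,C,D,E,F,Foo,Bar,G,H
A,B,C,D,E,F,G,H
A,B,C,D,E,F,Foo,Bar,G,H
A,B,C,D,E,F,G,H
A,B,C,D,E,F,G,H
"""

# on_bad_lines="skip" silently drops rows that have more fields than expected.
# on_bad_lines="warn" drops them too, but reports each skipped line.
df = pd.read_csv(io.StringIO(raw), header=None, on_bad_lines="skip")

print(df.shape)  # (5, 8): the two 10-field rows are gone
```

For a real file, pass the path instead of the StringIO object. Using "warn" rather than "skip" is probably the safer default when you are churning through thousands of files, since you at least get a record of which lines were thrown away.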