I\'m trying to use pandas to manipulate a .csv file but I get this error:
pandas.parser.CParserError: Error tokenizing data. C error: Expected 2 field
In my case, it is because the format of the first and last two lines of the csv file is different from the middle content of the file.
So what I do is open the csv file as a string, parse the content of the string, then use read_csv
to get a dataframe.
import io
import pandas as pd
file = open(f'{file_path}/{file_name}', 'r')
content = file.read()
# change new line character from '\r\n' to '\n'
lines = content.replace('\r', '').split('\n')
# Remove the first and last 2 lines of the file
# StringIO can be considered as a file stored in memory
df = pd.read_csv(StringIO("\n".join(lines[2:-2])), header=None)