I am using the pandas library with Python.
I have an Excel file with some heading information at the top of the sheet that I do not need for data extraction.
You could manually check for the header line and then use read_csv's keyword argument skiprows.
# Find the 0-based index of the first line that starts with 'ID' (the header row)
with open('data.csv') as fp:
    skip = next(filter(
        lambda x: x[1].startswith('ID'),
        enumerate(fp)
    ))[0]
Then skip the rows:
import pandas
df = pandas.read_csv('data.csv', skiprows=skip)
This way you can support pre-header sections of arbitrary length.
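For reference, here is a minimal self-contained sketch of the same idea that runs without a file on disk. The helper name find_header_row, the 'ID' prefix default, and the sample contents are assumptions made up for illustration. Unlike the bare next(filter(...)) call, which raises StopIteration when no line matches, this version raises a ValueError with a readable message.

```python
import io

import pandas as pd


def find_header_row(lines, prefix="ID"):
    """Return the 0-based index of the first line starting with `prefix`."""
    for index, line in enumerate(lines):
        if line.startswith(prefix):
            return index
    raise ValueError(f"no line starts with {prefix!r}")


# Hypothetical file contents: three preamble lines, then the real header.
sample = (
    "Quarterly export\n"
    "Source: example system\n"
    "Notes: none\n"
    "ID,name,value\n"
    "1,a,10\n"
    "2,b,20\n"
)

# First pass locates the header; second pass parses from that line on.
skip = find_header_row(io.StringIO(sample))
df = pd.read_csv(io.StringIO(sample), skiprows=skip)
print(df)
```

With a real file you would pass the open file object to find_header_row and the path to read_csv, as in the snippets above.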
For Python 2:
import itertools as it
with open('data.csv') as fp:
    skip = next(it.ifilter(
        lambda x: x[1].startswith('ID'),
        enumerate(fp)
    ))[0]
You can use pd.read_csv and specify skiprows=4:
import pandas as pd
df = pd.read_csv('test.csv', skiprows=4)
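To see what a fixed skiprows does, here is a small sketch using in-memory text in place of test.csv. The sample contents are invented for illustration and assume exactly four preamble lines. Since the question mentions an Excel sheet, note that pd.read_excel accepts a skiprows argument as well, so the same approach should carry over to reading the workbook directly.

```python
import io

import pandas as pd

# Hypothetical contents standing in for test.csv:
# four preamble lines, then the header and two data rows.
sample = (
    "Company report\n"
    "Generated by export tool\n"
    "Region: all\n"
    "Notes: none\n"
    "ID,name,value\n"
    "1,a,10\n"
    "2,b,20\n"
)

# skiprows=4 drops the first four lines, so the fifth line becomes the header.
df = pd.read_csv(io.StringIO(sample), skiprows=4)
print(df)
```

This only works when the preamble always has the same number of lines. If its length varies between files, searching for the header line is the safer option.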