Pandas: how to designate starting row to extract data

I am using the pandas library with Python.

I have an Excel file with some heading information at the top of the sheet that I do not need for data extraction.

2 Answers

清歌不尽

2021-01-21 02:37

You could find the header line yourself and then pass its index to read_csv's keyword argument skiprows.

with open('data.csv') as fp:
    # 0-based index of the first line that starts with 'ID' (the header line)
    skip = next(filter(
        lambda x: x[1].startswith('ID'),
        enumerate(fp)
    ))[0]

Then skip the rows:

import pandas

df = pandas.read_csv('data.csv', skiprows=skip)

This way you can handle pre-header sections of arbitrary length.
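The same idea can be wrapped in a small helper. This is only a sketch: the function names find_header_row and read_from_header are made up for illustration, and it assumes the header line starts with a known marker such as 'ID', as in the snippet above. Unlike the next(filter(...)) version, it raises a descriptive error when no such line exists.

```python
def find_header_row(path, marker="ID"):
    """Return the 0-based index of the first line that starts with `marker`."""
    with open(path) as fp:
        for index, line in enumerate(fp):
            if line.startswith(marker):
                return index
    raise ValueError(f"no line in {path!r} starts with {marker!r}")


def read_from_header(path, marker="ID"):
    # pandas is imported here so find_header_row stays usable on its own.
    import pandas as pd

    # An integer skiprows drops that many lines from the top of the file,
    # so the marker line becomes the header row of the DataFrame.
    return pd.read_csv(path, skiprows=find_header_row(path, marker))
```

Calling read_from_header('data.csv') should then give the same DataFrame as the two-step version, provided the marker really identifies the header line.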

For Python 2:

import itertools as it

with open('data.csv') as fp:
    skip = next(it.ifilter(
        lambda x: x[1].startswith('ID'),
        enumerate(fp)
    ))[0]

Happy的楠姐

2021-01-21 02:55
You can use pd.read_csv and specify skiprows=4:
```
import pandas as pd

df = pd.read_csv('test.csv', skiprows=4)
```
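Since the question is about an Excel sheet rather than a CSV file, it may help to note that pandas.read_excel accepts the same skiprows argument. A minimal sketch that picks the reader from the file extension, where reader_name_for and load_skipping are hypothetical helper names and the default of 4 skipped rows is carried over from the example above:

```python
import os


def reader_name_for(path):
    """Pick the pandas reader function name from the file extension."""
    ext = os.path.splitext(path)[1].lower()
    return "read_excel" if ext in {".xlsx", ".xls"} else "read_csv"


def load_skipping(path, skiprows=4):
    # Imported lazily so reader_name_for can be used without pandas installed.
    import pandas as pd

    # Both pandas.read_csv and pandas.read_excel accept skiprows; reading
    # .xlsx files additionally needs an Excel engine such as openpyxl.
    reader = getattr(pd, reader_name_for(path))
    return reader(path, skiprows=skiprows)
```

For example, load_skipping('test.xlsx') would call pd.read_excel('test.xlsx', skiprows=4), assuming the workbook has four heading rows above the column names.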