Say I have the following Excel file:
A B C
0 - - -
1 Start - -
2 3 2 4
3 7 8 4
4 11 2 17
You could use pd.read_excel('C:\Users\MyFolder\MyFile.xlsx', sheetname='Sheet1')
as it ignores empty excel cells.
Your DataFrame should then look like this:
A B C
0 Start NaN NaN
1 3 2 4
2 7 8 4
3 11 2 17
Then drop the first row by using
df.drop([0])
to get
A B C
0 3 2 4
1 7 8 4
2 11 2 17
If you know the specific rows you are interested in, you can skip from the top using skiprow
and then parse only the row (or rows) you want using nrows
- see pandas.read_excel
df = pd.read_excel('myfile.xlsx', 'Sheet1', skiprows=2, nrows=3,)
df = pd.read_excel('your/path/filename')
This answer helps in finding the location of 'start' in the df
for row in range(df.shape[0]):
for col in range(df.shape[1]):
if df.iat[row,col] == 'start':
row_start = row
break
after having row_start you can use subframe of pandas
df_required = df.loc[row_start:]
And if you don't need the row containing 'start', just u increment row_start by 1
df_required = df.loc[row_start+1:]