问题
i have the following excel file, and i would like to clean specific rows/columns so that i can further process the file.
I have tried this, but i have not managed to remove any of the blank lines, i ve only managed to trim from those containing data. Here, i was trying to only save the data from the third row and on.
xl = pd.ExcelFile("MRD.xlsx")
df = xl.parse("Sheet3")
df2 = df.iloc[3:]
writer4 = pd.ExcelWriter('pandas3.out.no3lines.xlsx', engine='xlsxwriter')
table5 = pd.DataFrame(df2)
table5.to_excel(writer4, sheet_name='Sheet1')
writer4.save()
I specifically want to remove rows 1, 3 (the empty ones) and the first column, so that i can pivot it. Is there a way to do this? Thank you.
回答1:
You can use drop(...)
to remove rows, and drop(..., axis=1)
to remove columns
data = [
['', '', '', ''],
['', 1, 2, 3],
['', '', '', ''],
['', 7, 8, 9],
]
import pandas as pd
df = pd.DataFrame(data)
# drop first column - [0]
df = df.drop(0, axis=1)
# drop first and third row - [0,2]
df = df.drop([0,2])
print(df)
Before:
0 1 2 3
0
1 1 2 3
2
3 7 8 9
After:
1 2 3
1 1 2 3
3 7 8 9
来源:https://stackoverflow.com/questions/47498667/python-pandas-delete-specific-rows-columns-in-excel