Question
I have a very simple problem. I'd like to take a data frame, perform a groupby on some columns, and extract the index (in the original data frame) of the first row in each group. How do I do this?
I've tried playing with as_index, group_keys, and reset_index(), and nothing seems to work.
Answer 1:
You need the groupby function first(), applied after reset_index() has copied the original index into a column:
import pandas as pd

x = pd.DataFrame([{'name': 'b1', 'group': 'a'},
                  {'name': 'b2', 'group': 'a'},
                  {'name': 'b3', 'group': 'a'},
                  {'name': 'b4', 'group': 'b'},
                  {'name': 'b5', 'group': 'b'},
                  {'name': 'b6', 'group': 'a'},
                  {'name': 'b7', 'group': 'c'},
                  {'name': 'b8', 'group': 'c'}])
x = x.reset_index()  # add the original indices as a column named 'index'
xc = x.groupby('group').first()  # first value of each column within each group
print(xc)
       index name
group
a          0   b1
b          3   b4
c          6   b7
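If only the index labels are needed, a possible variation is sketched below. It is not part of the original answer: it assumes the same example data and uses groupby(...).head(1), which returns the first row of each group with its original index intact, so no reset_index() step is required. The names first_rows, first_idx, and alt_idx are illustrative. One reason to consider it is that first() takes the first non-null value of each column by default, so when the data contains missing values its result may not correspond to a single original row.

```python
import pandas as pd

# Same example data as in the answer above
x = pd.DataFrame({
    'name': ['b1', 'b2', 'b3', 'b4', 'b5', 'b6', 'b7', 'b8'],
    'group': ['a', 'a', 'a', 'b', 'b', 'a', 'c', 'c'],
})

# head(1) keeps the first row of every group and preserves the original index
first_rows = x.groupby('group').head(1)
first_idx = first_rows.index.tolist()
print(first_idx)

# Alternative without groupby: keep rows whose group key has not been seen before
alt_idx = x.index[~x.duplicated('group')].tolist()
print(alt_idx)
```

Both variants return the labels in the order the rows appear in the original data frame, which can differ from the sorted group order that first() produces.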
Source: https://stackoverflow.com/questions/61021026/pandas-groupby-first-extract-index-from-original-dataframe