Collapsing rows in a Pandas dataframe

梦谈多话 2021-02-09 20:27

I'm trying to collapse rows in a dataframe that contains a column of ID data and a number of columns that each hold a different string. It looks like groupby is the solution, b

2 answers

日久生厌 (original poster)

2021-02-09 20:41

Assuming the blank cells are empty strings ('').

Option 1: pivot_table

df.pivot_table(['apples', 'pears', 'oranges'], 'ID', aggfunc=''.join)

Option 2: sort each column and take the last row, since '' sorts before any non-empty string

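import numpy as np
import pandas as pd

def f(df):
    # Sort each column independently (axis 0); '' sorts first, so the last
    # row holds the non-blank value of every column. df.name is the group's ID.
    return pd.DataFrame(np.sort(df.values, 0)[[-1]], [df.name], df.columns)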

df.set_index(
    'ID', append=True
).groupby(level='ID', group_keys=False).apply(f)

Both yield

     apples  oranges  pears
ID                         
101          oranges       
134  apples           pears
576          oranges  pears
837  apples
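
To try this end to end, here is a minimal sketch. The input frame is hypothetical: the question's data is not shown, so it was reconstructed to be consistent with the output above, with blanks stored as ''. It runs option 1 as written, plus a groupby/max variant that is not part of the answer but relies on the same assumption as option 2 (that '' sorts before any non-empty string).

```python
import pandas as pd

# Hypothetical input (not from the question): each ID is spread over
# one or more rows, and blank cells are empty strings.
df = pd.DataFrame({
    'ID':      [101, 134, 134, 576, 576, 837],
    'apples':  ['', 'apples', '', '', '', 'apples'],
    'oranges': ['oranges', '', '', 'oranges', '', ''],
    'pears':   ['', '', 'pears', '', 'pears', ''],
})

# Option 1 from the answer: within each ID, join the strings of every column.
collapsed = df.pivot_table(['apples', 'pears', 'oranges'], 'ID', aggfunc=''.join)

# Alternative (an assumption, not the answer's method): the per-group maximum
# keeps the non-blank string, because '' compares lower than any other string.
by_max = df.groupby('ID').max()

print(collapsed)
print(by_max)
```

Both variants only behave this way if each column holds at most one non-blank value per ID. With several non-blank values, ''.join concatenates them, while max keeps only the largest.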
