Collapsing rows in a Pandas dataframe

梦谈多话 2021-02-09 20:27

I'm trying to collapse rows in a dataframe that contains a column of ID data and a number of columns that each hold a different string. It looks like groupby is the solution, b

2 answers
  •  日久生厌
    2021-02-09 20:41

    Assuming blanks are ''

    option 1
    pivot_table

    df.pivot_table(['apples', 'pears', 'oranges'], 'ID', aggfunc=''.join)
    

    option 2
    Sort each column within a group and take the last row; since '' sorts before any non-empty string, the last row holds the non-blank values.

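    import numpy as np
    import pandas as pd

    def f(df):
        # Sort each column; '' sorts first, so the last row holds the non-blank values.
        return pd.DataFrame(np.sort(df.values, 0)[[-1]], [df.name], df.columns)
    
    df.set_index(
        'ID', append=True
    ).groupby(level='ID', group_keys=False).apply(f)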
    

    Both yield

         apples  oranges  pears
    ID                         
    101          oranges       
    134  apples           pears
    576          oranges  pears
    837  apples                
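
Since the question mentions groupby, here is a minimal sketch of that route, again assuming blanks are ''. The input frame is hypothetical: the question's data is not shown, so these rows were made up to be consistent with the result above. Joining the strings within each ID works because the blanks contribute nothing to the joined value.

```python
import pandas as pd

# Hypothetical input (an assumption, not the asker's real data):
# each fruit string sits on its own row, blanks are ''.
df = pd.DataFrame({
    'ID':      [101, 134, 134, 576, 576, 837],
    'apples':  ['', 'apples', '', '', '', 'apples'],
    'oranges': ['oranges', '', '', 'oranges', '', ''],
    'pears':   ['', '', 'pears', '', 'pears', ''],
})

# Concatenate the strings of every column within each ID group.
collapsed = df.groupby('ID').agg(''.join)
print(collapsed)
```

This only collapses cleanly when each ID has at most one non-blank value per column; if a value repeats within a group, the join would concatenate the copies, and something like `.max()` may be a better fit.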
    
