How to create a group-by in pandas on only one level

忘掉有多难 2021-01-16 01:04

I am importing the df3 DataFrame below from my Excel file and want to group by Name only; the remaining duplicate data should appear as shown below.

Note: each month's data will be added.

3 Answers
  • 2021-01-16 01:26

    Code

    # Create sample data as per the requirement
    import pandas as pd
    df = pd.DataFrame({'Name': ['Jon', 'Jon', 'Jon', 'Mike', 'Mike', 'Jon', 'Jon'],
                       'ID': [1, 1, 1, 1, 1, 1, 1],
                       'Month': ['Feb', 'Jan', 'Mar', 'Jan', 'Jan', 'Feb', 'Jan'],
                       'Shift': ['A', 'B', 'C', 'A', 'B', 'C', 'A']})
    # Display the data
    df
    

    # Make Month an ordered categorical so it sorts in calendar order, not alphabetically
    df['Month'] = pd.Categorical(df['Month'],
                                 categories=['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
                                             'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec'],
                                 ordered=True)
    df = df.sort_values(['Name', 'Month']).reset_index(drop=True)
    # Display the final data
    df
    

    I hope this helps. :)
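
If the goal is for each name to appear only once, at the top of its block, one option after the sort above is to blank the repeated `Name` values. This is only a minimal sketch on the same sample data; the variable name `first_of_group` is made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Jon', 'Jon', 'Jon', 'Mike', 'Mike', 'Jon', 'Jon'],
    'ID': [1, 1, 1, 1, 1, 1, 1],
    'Month': ['Feb', 'Jan', 'Mar', 'Jan', 'Jan', 'Feb', 'Jan'],
    'Shift': ['A', 'B', 'C', 'A', 'B', 'C', 'A'],
})

# Sort by name, then by calendar month, as in the answer above.
month_order = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
               'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec']
df['Month'] = pd.Categorical(df['Month'], categories=month_order, ordered=True)
df = df.sort_values(['Name', 'Month']).reset_index(drop=True)

# Keep the name on the first row of each group and blank the repeats.
first_of_group = ~df['Name'].duplicated()
df['Name'] = df['Name'].where(first_of_group, '')
print(df)
```

Unlike the header-row layouts in the other answers, this keeps the name on the same row as the group's first record.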

  • 2021-01-16 01:32

    You can achieve it like this:

    df = df.iloc[pd.to_datetime(df.Month, format='%b').argsort()]
    blank = pd.DataFrame(dict.fromkeys(df.columns, ['']))  # separator row between groups
    df = pd.concat([pd.concat([pd.DataFrame({'Month': [x]}), y, blank]).fillna('')
                    for x, y in df.groupby('Name')]).drop(columns='Name').iloc[:-1]
    

    print(df)
    
     Month ID Shift
    0   Jon         
    1   Jan  1     B
    6   Jan  1     A
    0   Feb  1     A
    5   Feb  1     C
    2   Mar  1     C
    0               
    0  Mike         
    3   Jan  1     A
    4   Jan  1     B
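
The one-liner above is dense, so here is a sketch of the same idea spelled out step by step with `pd.concat`. The function name `with_group_headers` and its parameters are made up for this illustration; it assumes a `Name` grouping column, three-letter month abbreviations, and a non-empty frame.

```python
import pandas as pd


def with_group_headers(df, group_col='Name', label_col='Month'):
    """Put each group's name on its own row above that group's rows."""
    # Stable sort by calendar month, so ties keep their input order.
    df = df.sort_values(label_col, kind='stable',
                        key=lambda s: pd.to_datetime(s, format='%b'))

    # Output columns: the label column first, then everything except the group column.
    out_cols = [label_col] + [c for c in df.columns
                              if c not in (group_col, label_col)]
    blank = pd.DataFrame([[''] * len(out_cols)], columns=out_cols)

    pieces = []
    for name, rows in df.groupby(group_col):
        header = blank.copy()
        header[label_col] = name  # the header row carries the group name
        # Header, the group's rows, then a blank separator row.
        pieces += [header, rows[out_cols].astype(object), blank]

    # Drop the trailing separator after the last group.
    return pd.concat(pieces[:-1], ignore_index=True)


sample = pd.DataFrame({
    'Name': ['Jon', 'Jon', 'Jon', 'Mike', 'Mike', 'Jon', 'Jon'],
    'ID': [1, 1, 1, 1, 1, 1, 1],
    'Month': ['Feb', 'Jan', 'Mar', 'Jan', 'Jan', 'Feb', 'Jan'],
    'Shift': ['A', 'B', 'C', 'A', 'B', 'C', 'A'],
})
result = with_group_headers(sample)
print(result)
```

Casting each group's rows to `object` before concatenating keeps the integer IDs from being shown as floats next to the blank cells.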
    
  • 2021-01-16 01:40

    Here's another solution using a list comprehension and df.duplicated, with positional indexing for the assignment.

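    import numpy as np
    import pandas as pd

    df = pd.read_excel(file, sheet_name=yoursheet)

    # Order the months chronologically.
    df['Month'] = pd.Categorical(df['Month'],
                   pd.to_datetime(df['Month'], format='%b').drop_duplicates().sort_values().dt.strftime('%b'))

    # Sort by month, then cast to plain objects so blanks can be assigned later.
    df = df.sort_values(['Month']).reset_index(drop=True).astype(object)

    # Append a copy of each group's first row to the end of that group.
    df1 = pd.concat([pd.concat([data, data.iloc[[0]]]) for name, data in df.groupby('Name')])

    # The original first row of each group is now a duplicate:
    # blank everything except the first column (assumes Name is the first column).
    df1.iloc[df1.duplicated(keep='last').to_numpy(), 1:] = ''

    # Rows that still have a month show it in the first column; header rows keep the name.
    df1['Name'] = np.where(df1['Month'].ne(''), df1['Month'], df1['Name'])

    final = df1.drop(columns='Month')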
    

       Name ID Shift
    0   Jon         
    3   Jan  1     A
    4   Feb  1     A
    5   Feb  1     C
    6   Mar  1     C
    0   Jan  1     B
    1  Mike         
    2   Jan  1     B
    1   Jan  1     A
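
The blanking step in this answer relies on how `duplicated(keep='last')` flags rows: every repeat except the final one is marked `True`. So once a copy of a group's first row is appended to the end of the group, the original first row is the one that gets flagged. A small illustration with made-up data:

```python
import pandas as pd

rows = pd.DataFrame({
    'Name': ['Jon', 'Jon', 'Jon'],
    'Month': ['Jan', 'Feb', 'Jan'],  # the last row repeats the first
})

# keep='last' leaves the final occurrence unflagged and marks earlier repeats.
flags = rows.duplicated(keep='last')
print(flags.tolist())  # [True, False, False]
```

Note that genuinely repeated records in the real data would be flagged the same way, so this approach assumes the rows within a group are otherwise unique.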
    