I am importing the df3 dataframe below into my Excel file and want to group by Name only; the remaining duplicate data should be laid out as shown below.
Note: new data will be added each month.
Code
#creating sample data as per requirement
import pandas as pd
df = pd.DataFrame({'Name':['Jon','Jon','Jon','Mike','Mike','Jon','Jon'],'ID':[1,1,1,1,1,1,1], 'Month':['Feb','Jan','Mar','Jan','Jan','Feb','Jan'], 'Shift':['A','B','C','A','B','C','A']})
#display data
df
df['Month'] = pd.Categorical(df['Month'],categories=['Jan','Feb','Mar','Apr','May','Jun','Jul','Aug','Sep','Oct','Nov','Dec'],ordered=True)
df = df.sort_values(['Name','Month']).reset_index(drop=True)
#display final data
df
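To see why the categorical conversion matters, here is a small sketch comparing a plain string sort with an ordered categorical sort. The `months` list and the variable names are only for illustration and are not part of the code above.

```python
import pandas as pd

months = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
          'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec']
s = pd.Series(['Feb', 'Jan', 'Mar'])

# Plain strings sort alphabetically, so 'Feb' comes before 'Jan'.
alphabetical = s.sort_values().tolist()

# An ordered categorical sorts by position in the categories list instead.
as_cat = pd.Series(pd.Categorical(s, categories=months, ordered=True))
calendar = as_cat.sort_values().astype(str).tolist()

print(alphabetical)
print(calendar)
```

Without the categorical step, the months would appear in alphabetical order inside each name.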
I hope this would be helpful... : )
You can achieve it by
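df = df.iloc[pd.to_datetime(df.Month, format='%b').argsort()]
# Per name: a header row, the group's rows, then a blank separator row.
# pd.concat is used because DataFrame.append was removed in pandas 2.0.
df = pd.concat([pd.concat([pd.DataFrame({'Month': [x]}), y, pd.DataFrame(dict.fromkeys(y.columns, ['']))]).fillna('') for x, y in df.groupby('Name')]).drop(columns='Name').iloc[:-1]
print(df)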
  Month ID Shift
0   Jon
1   Jan  1     B
6   Jan  1     A
0   Feb  1     A
5   Feb  1     C
2   Mar  1     C
0
0  Mike
3   Jan  1     A
4   Jan  1     B
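The same idea (a header row per name, then that name's rows, then a blank separator) may be easier to follow as a small function. This is only a sketch: `grouped_layout`, its parameters, and the `sample` frame are hypothetical names, and it assumes the frame is already sorted by month.

```python
import pandas as pd

def grouped_layout(df, key='Name', label_col='Month'):
    """Return df with one header row per group and a blank row between groups."""
    parts = []
    for name, group in df.groupby(key):
        # Header row: the group name goes into the label column.
        header = pd.DataFrame({label_col: [name]})
        # Separator row: an empty string in every column.
        blank = pd.DataFrame({col: [''] for col in group.columns})
        # Cast to object so numbers and '' can share a column.
        parts.append(pd.concat([header, group.astype(object), blank]).fillna(''))
    out = pd.concat(parts).drop(columns=key)
    # Drop the trailing separator and renumber the rows.
    return out.iloc[:-1].reset_index(drop=True)

sample = pd.DataFrame({'Name': ['Jon', 'Mike', 'Jon'],
                       'ID': [1, 1, 1],
                       'Month': ['Jan', 'Jan', 'Feb'],
                       'Shift': ['A', 'B', 'C']})
result = grouped_layout(sample)
print(result)
```

Unlike the one-liner, this version resets the index, so the row labels run from 0 upward instead of repeating the original ones.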
Here's another solution using a list comprehension and df.duplicated, with .loc for assignment.
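import numpy as np
import pandas as pd
df = pd.read_excel(file,sheet_name=yoursheet)
#order the months.
df['Month'] = pd.Categorical(df['Month'],
    pd.to_datetime(df['Month'],format='%b').drop_duplicates().sort_values().dt.strftime('%b'))
df = df.sort_values(['Month']).reset_index(drop=True)
#repeat each group's first row at the end (pd.concat replaces DataFrame.append, removed in pandas 2.0);
#casting to object lets '' be written into the ID and categorical Month columns.
df1 = pd.concat([pd.concat([data, data.iloc[[0]]]) for name,data in df.groupby('Name')]).astype(object)
#blank every column except Name on the original first row, which becomes the header row.
df1.loc[df1.duplicated(keep='last'), df1.columns[1:]] = ''
df1['Name'] = np.where(df1['Month'].ne(''),df1['Month'],df1['Name'])
final = df1.drop(columns='Month')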
   Name ID Shift
0   Jon
3   Jan  1     A
4   Feb  1     A
5   Feb  1     C
6   Mar  1     C
0   Jan  1     B
1  Mike
2   Jan  1     B
1   Jan  1     A
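The header rows come from the duplicated(keep='last') step. A tiny sketch of that step on its own, with made-up data (`data`, `extended`, and `mask` are illustrative names):

```python
import pandas as pd

data = pd.DataFrame({'Name': ['Jon', 'Jon'], 'Shift': ['A', 'B']})

# Repeat the first row at the end, as the answer does for each group.
extended = pd.concat([data, data.iloc[[0]]])

# keep='last' flags every occurrence except the last one,
# so only the original first row is marked as a duplicate.
mask = extended.duplicated(keep='last').tolist()
print(mask)
```

The flagged row is the one that gets blanked and turned into the header, while its copy at the end keeps the data. That is why the first row of each group shows up last in the output above.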