Perform pandas aggregation while keeping the date column intact

醉话见心 2021-01-25 03:40
user = {'id':['abab23', 'abab21', 'abab22', 'abab25', 'abab24', 'abab30', 'abab252', 'abab15'],
        'dob':['10-10-1990','1-12-1993', '12-12-


        
2 answers
  •  挽巷 (original poster)
    2021-01-25 04:18

    Try this:

    import pandas as pd

    activities = {'sentconn': ['abab35', 'abab15', 'abab25', 'abab23', 'abab22', 'abab15'],
                  'receiveconn': ['abab24', 'abab24', 'abab21', 'abab35', 'abab252', 'abab30'],
                  'sentdate': ['2-10-2020', '2-10-2020', '4-10-2020', '5-10-2020', '10-10-2020', '11-10-2020'],
                  'receivedDate': ['2-10-2020', '2-10-2020', '4-10-2020', '5-10-2020', '10-10-2020', '11-10-2020']}

    user = {'id': ['abab23', 'abab21', 'abab22', 'abab25', 'abab24', 'abab30', 'abab252', 'abab15'],
            'dob': ['10-10-1990', '1-12-1993', '12-12-2000', '2-10-1999', '2-10-1999', '2-10-1999', '2-10-1999', '2-10-1999']}

    usr_df = pd.DataFrame(user)
    df = pd.DataFrame(activities)

    # Group by (date, user) on each side to count sent and received connections.
    df1 = df.groupby(['sentdate', 'sentconn']).agg({'sentconn': 'count'})
    df2 = df.groupby(['receivedDate', 'receiveconn']).agg({'receiveconn': 'count'})

    # Rename the index levels so both frames share the same (date, user) index.
    df1 = df1.rename_axis(['date', 'user'])
    df2 = df2.rename_axis(['date', 'user'])

    # Align the two counts side by side; a count missing on one side becomes 0.
    df = pd.concat([df1, df2], axis=1)\
            .fillna(0)\
            .reset_index()

    # Keep only the users whose id is present in usr_df.
    df = df.loc[df['user'].isin(usr_df['id'])]\
            .set_index(['date', 'user'])
    print(df)
    

    Output:

                       sentconn  receiveconn
    date       user                          
    10-10-2020 abab22        1.0          0.0
               abab252       0.0          1.0
    11-10-2020 abab15        1.0          0.0
               abab30        0.0          1.0
    2-10-2020  abab15        1.0          0.0
               abab24        0.0          2.0
    4-10-2020  abab21        0.0          1.0
               abab25        1.0          0.0
    5-10-2020  abab23        1.0          0.0
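
In the result above the dates are still strings, so they sort lexicographically (10-10-2020 comes before 2-10-2020) and the counts come back as floats. The following is a minimal sketch of one way to keep real dates in the index, not part of the original answer. It assumes the strings are day-month-year, which the question does not confirm, and `count_events` is a hypothetical helper name introduced only for illustration.

```python
import pandas as pd

activities = {
    'sentconn': ['abab35', 'abab15', 'abab25', 'abab23', 'abab22', 'abab15'],
    'receiveconn': ['abab24', 'abab24', 'abab21', 'abab35', 'abab252', 'abab30'],
    'sentdate': ['2-10-2020', '2-10-2020', '4-10-2020', '5-10-2020', '10-10-2020', '11-10-2020'],
    'receivedDate': ['2-10-2020', '2-10-2020', '4-10-2020', '5-10-2020', '10-10-2020', '11-10-2020'],
}
user_ids = ['abab23', 'abab21', 'abab22', 'abab25', 'abab24', 'abab30', 'abab252', 'abab15']

df = pd.DataFrame(activities)


def count_events(frame, date_col, user_col, name):
    """Hypothetical helper: count rows per (date, user), returned as a Series called `name`."""
    # Assumption: the strings are day-month-year; change the format if they are month-first.
    dates = pd.to_datetime(frame[date_col], format='%d-%m-%Y').rename('date')
    users = frame[user_col].rename('user')
    return frame.groupby([dates, users]).size().rename(name)


sent = count_events(df, 'sentdate', 'sentconn', 'sentconn')
received = count_events(df, 'receivedDate', 'receiveconn', 'receiveconn')

# Outer-align the two counts on (date, user); a missing count becomes 0.
result = (
    pd.concat([sent, received], axis=1)
    .fillna(0)
    .astype(int)
    .sort_index()
)

# Drop users that are not in the known id list.
result = result[result.index.get_level_values('user').isin(user_ids)]
print(result)
```

Because the `date` level is now a datetime, `sort_index()` orders the rows chronologically, and a date range can be selected by slicing the index instead of comparing strings.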
    
