Perform pandas aggregation while keeping the date column intact

醉话见心 2021-01-25 03:40
user = {'id':['abab23', 'abab21', 'abab22', 'abab25', 'abab24', 'abab30', 'abab252', 'abab15'],
        'dob':['10-10-1990','1-12-1993', '12-12-


        
2 answers
  •  爱一瞬间的悲伤
    2021-01-25 04:44

    Use:

    # select only the necessary columns
    activities = activities[['sentconn','receiveconj','sentdate','receivedDate']]
    
    # set new column names
    activities.columns = ['sent_id','receive_id','sent_date','receive_date']
    
    # split the column names on _ to build a MultiIndex
    activities.columns = activities.columns.str.split('_', expand=True)
    
    # reshape the DataFrame and keep only ids that exist in user (inner merge on id)
    activities = (activities.stack(0)
                            .rename_axis([None, 'type'])
                            .reset_index(level=1)
                            .merge(user['id']))
    print (activities)
          type        date       id
    0  receive   2-10-2020   abab24
    1  receive   2-10-2020   abab24
    2     sent   2-10-2020   abab15
    3     sent  11-10-2020   abab15
    4  receive   4-10-2020   abab21
    5     sent   4-10-2020   abab25
    6     sent   5-10-2020   abab23
    7  receive  10-10-2020  abab252
    8     sent  10-10-2020   abab22
    9  receive  11-10-2020   abab30
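
The reshape step above can be tried in isolation with a minimal, self-contained sketch. The question's real data is truncated, so the small `user` and `activities` frames below are made up for illustration (including the unknown id `zzz99`), and the columns start out already renamed to `sent_id`, `receive_id`, `sent_date` and `receive_date`. The name `long_df` is also just an illustrative choice.

```python
import pandas as pd

# Made-up sample data (assumption): the original frames are not fully shown
user = pd.DataFrame({'id': ['abab23', 'abab21', 'abab22']})
activities = pd.DataFrame({
    'sent_id':      ['abab23', 'abab22', 'zzz99'],
    'receive_id':   ['abab21', 'abab23', 'abab22'],
    'sent_date':    ['5-10-2020', '10-10-2020', '11-10-2020'],
    'receive_date': ['5-10-2020', '10-10-2020', '11-10-2020'],
})

# 'sent_id' -> ('sent', 'id'): the columns become a two-level MultiIndex
activities.columns = activities.columns.str.split('_', expand=True)

# Move the first column level (sent/receive) into the rows, expose it as a
# 'type' column, then keep only rows whose id appears in `user`
long_df = (activities.stack(0)
                     .rename_axis([None, 'type'])
                     .reset_index(level=1)
                     .merge(user[['id']], on='id'))
print(long_df)
```

Each original row contributes one `sent` and one `receive` row, and the inner merge drops `zzz99` because it is not listed in `user`. Recent pandas versions may emit a `FutureWarning` from `stack` about its newer implementation; row and column order can differ between versions, so it is safer not to rely on them.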
    

    # count sent/receive rows per date and id with crosstab
    df = pd.crosstab([activities['date'], activities['id']], activities['type'])
    print (df)
    type                receive  sent
    date       id                    
    10-10-2020 abab22         0     1
               abab252        1     0
    11-10-2020 abab15         0     1
               abab30         1     0
    2-10-2020  abab15         0     1
               abab24         2     0
    4-10-2020  abab21         1     0
               abab25         0     1
    5-10-2020  abab23         0     1
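
In the crosstab output above, `date` and `id` end up in the row index, and because the dates are plain strings they sort lexically (`10-10-2020` comes before `2-10-2020`). If the goal is to keep `date` as an ordinary column in chronological order, one possible follow-up is sketched below. It reuses a few rows from the printed `activities` frame and assumes the dates are day-month-year, which the thread does not state; `counts` is an illustrative name.

```python
import pandas as pd

# A few rows copied from the long-format `activities` output shown earlier
activities = pd.DataFrame({
    'type': ['receive', 'receive', 'sent', 'sent', 'receive'],
    'date': ['2-10-2020', '2-10-2020', '2-10-2020', '11-10-2020', '10-10-2020'],
    'id':   ['abab24', 'abab24', 'abab15', 'abab15', 'abab252'],
})

# Assumption: day-month-year; use '%m-%d-%Y' instead if the data is month-first
activities['date'] = pd.to_datetime(activities['date'], format='%d-%m-%Y')

# Count rows per (date, id) and type, then turn the index levels back into columns
counts = pd.crosstab([activities['date'], activities['id']],
                     activities['type']).reset_index()

# Remove the leftover 'type' label from the column axis
counts.columns.name = None
print(counts)
```

After `reset_index()`, `date` and `id` are regular columns next to the `receive` and `sent` counts, and the parsed dates sort chronologically rather than as text.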
    
