Applying Pandas melt() on a dataframe with multiple variable columns

时光取名叫无心 2021-01-27 09:56

I have a dataframe in which each row is a unique person and each column is a type of action taken. I need to restructure the data so that each row shows an individual event. Here is my current and

2 Answers
  • 2021-01-27 10:29

    Use df.melt (v0.20+):

    df
         action a    action b    action c   name
    0  2017-10-04  2017-10-05  2017-10-06   ross
    1  2017-10-04  2017-10-05  2017-10-06  allen
    2  2017-10-04  2017-10-05  2017-10-06    jon
    
    df = df.melt('name').sort_values('name')
    df.columns = ['name', 'action', 'date']
    df
        name    action        date
    1  allen  action a  2017-10-04
    4  allen  action b  2017-10-05
    7  allen  action c  2017-10-06
    2    jon  action a  2017-10-04
    5    jon  action b  2017-10-05
    8    jon  action c  2017-10-06
    0   ross  action a  2017-10-04
    3   ross  action b  2017-10-05
    6   ross  action c  2017-10-06
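
As a minimal, self-contained sketch of the same reshape: the frame below is rebuilt from the sample shown in this answer, and the `long_df` name is an assumption made for illustration. It passes `var_name` and `value_name` to `melt` so the columns are named in one step instead of being reassigned afterwards.

```python
import pandas as pd

# Sample frame rebuilt from the answer above (one row per person,
# one column per action type).
df = pd.DataFrame({
    "action a": ["2017-10-04"] * 3,
    "action b": ["2017-10-05"] * 3,
    "action c": ["2017-10-06"] * 3,
    "name": ["ross", "allen", "jon"],
})

# Keep `name` as the identifier; every other column becomes an
# (action, date) pair on its own row.
long_df = (
    df.melt(id_vars="name", var_name="action", value_name="date")
      .sort_values(["name", "action"])
)

print(long_df)
```

Sorting on both `name` and `action` keeps each person's events in a predictable order; add `.reset_index(drop=True)` if the original row labels are not needed.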
    
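  • # Assumes df has a string column `roles` holding comma-separated values.
    r = df.roles
    # Number of roles per row = number of commas + 1.
    c = df.roles.str.count(',') + 1
    i = df.index
    # Repeat each row once per role, then overwrite `roles` with the flattened list.
    df.loc[i.repeat(c)].assign(roles=','.join(r).split(','))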
    
      company  employer_id                roles
    0       a            1             engineer
    0       a            1       data_scientist
    0       a            1            architect
    1       b            2             engineer
    1       b            2  front_end_developer
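
The input frame for this answer is not shown, so the sketch below uses a hypothetical one inferred from the output above; the names `counts`, `flat_roles`, `out` and `alt` are also assumptions. It runs the same repeat-and-assign approach and, for comparison, the split-then-`explode` route that should be available on pandas 0.25 and later.

```python
import pandas as pd

# Hypothetical input, inferred from the output shown in the answer.
df = pd.DataFrame({
    "company": ["a", "b"],
    "employer_id": [1, 2],
    "roles": ["engineer,data_scientist,architect",
              "engineer,front_end_developer"],
})

# Approach from the answer: repeat each row once per role, then
# replace `roles` with the flattened list of individual roles.
counts = df["roles"].str.count(",") + 1
flat_roles = ",".join(df["roles"]).split(",")
out = df.loc[df.index.repeat(counts)].assign(roles=flat_roles)

# Alternative (pandas 0.25+): split into lists, then explode.
alt = df.assign(roles=df["roles"].str.split(",")).explode("roles")

print(out)
```

Both variants assume the roles contain no stray whitespace around the commas; strip the values first if they might.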
    