Python - Unnest cells in Pandas DataFrame

执念已碎 2021-01-12 07:14

Suppose I have a DataFrame df:

a b c
v f 3|4|5
v 2 6
v f 4|5
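
To reproduce the frame above, something like the following should work. This is only a sketch: the post does not state the dtypes, so it assumes every value in b and c is stored as a string.

```python
import pandas as pd

# Sample frame from the question; all values are assumed to be strings.
df = pd.DataFrame({
    "a": ["v", "v", "v"],
    "b": ["f", "2", "f"],
    "c": ["3|4|5", "6", "4|5"],
})
print(df)
```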

I'd like to produce this df:

a b c
v f 3
v f 4
v f 5
v 2 6
v f 4
v f 5

2 Answers
  • 2021-01-12 07:37

    You could:

    import numpy as np
    
    df = df.set_index(['a', 'b'])
    df = df.astype(str) + '| '  # the trailing ' ' field is a sentinel, matched by replace() below
    df = (df.c.str.split('|', expand=True).stack()
            .reset_index(-1, drop=True)
            .replace(' ', np.nan).dropna()  # turn the ' ' sentinel fields into NaN and drop them
            .reset_index())
    

    to get:

       a  b  0
    0  v  f  3
    1  v  f  4
    2  v  f  5
    3  v  2  6
    4  v  f  4
    5  v  f  5
    
  • 2021-01-12 07:51

    Option 1

    In [3404]: (df.set_index(['a', 'b'])['c']
                  .str.split('|', expand=True).stack()
                  .reset_index(name='c').drop(columns='level_2'))
    Out[3404]:
       a  b  c
    0  v  f  3
    1  v  f  4
    2  v  f  5
    3  v  2  6
    4  v  f  4
    5  v  f  5
    

    Option 2: Using repeat and loc

    In [3503]: s = df.c.str.split('|')
    
    In [3504]: df.loc[df.index.repeat(s.str.len())].assign(c=np.concatenate(s))
    Out[3504]:
       a  b  c
    0  v  f  3
    0  v  f  4
    0  v  f  5
    1  v  2  6
    2  v  f  4
    2  v  f  5
    

    Details

    In [3505]: s
    Out[3505]:
    0    [3, 4, 5]
    1          [6]
    2       [4, 5]
    Name: c, dtype: object
    
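
Neither answer above uses it, but pandas 0.25 and later also provide `DataFrame.explode`, which may be a shorter route to the same result. The following is a minimal sketch, assuming c holds pipe-delimited strings. The sample frame is rebuilt here so the snippet runs on its own, and the name `out` is just an illustrative choice.

```python
import pandas as pd

# Sample frame from the question; all values are assumed to be strings.
df = pd.DataFrame({
    "a": ["v", "v", "v"],
    "b": ["f", "2", "f"],
    "c": ["3|4|5", "6", "4|5"],
})

# Split each cell of c into a list, then emit one row per list element.
out = (
    df.assign(c=df["c"].str.split("|"))
      .explode("c")
      .reset_index(drop=True)  # replace the repeated index 0,0,0,1,2,2 with 0..5
)
print(out)
```

The values in c remain strings after the split; if numbers are needed, a follow-up such as `out.astype({"c": int})` should convert them.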