Pandas column of lists, create a row for each list element

有刺的猬 2020-11-22 06:59

I have a dataframe where some cells contain lists of multiple values. Rather than storing multiple values in a cell, I'd like to expand the dataframe so that each item in t

10 answers
  •  一向
    一向 (original poster)
    2020-11-22 07:30

    Pandas >= 0.25

    Series and DataFrame both define an .explode() method that expands each element of a list-like into its own row, repeating the index value for every element. See the docs section on Exploding a list-like column.

    import numpy as np
    import pandas as pd

    df = pd.DataFrame({
        'var1': [['a', 'b', 'c'], ['d', 'e'], [], np.nan],
        'var2': [1, 2, 3, 4]
    })
    df
            var1  var2
    0  [a, b, c]     1
    1     [d, e]     2
    2         []     3
    3        NaN     4
    
    df.explode('var1')
    
      var1  var2
    0    a     1
    0    b     1
    0    c     1
    1    d     2
    1    e     2
    2  NaN     3  # empty list converted to NaN
    3  NaN     4  # NaN entry preserved as-is
    
    # to reset the index to be monotonically increasing...
    df.explode('var1').reset_index(drop=True)
    
      var1  var2
    0    a     1
    1    b     1
    2    c     1
    3    d     2
    4    e     2
    5  NaN     3
    6  NaN     4
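
As an alternative to chaining reset_index, newer releases (pandas 1.1 and later, as far as I know) accept an ignore_index argument on explode itself. A minimal sketch, reusing the same example frame:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'var1': [['a', 'b', 'c'], ['d', 'e'], [], np.nan],
    'var2': [1, 2, 3, 4],
})

# ignore_index=True relabels the exploded rows 0..n-1,
# so no separate reset_index(drop=True) call is needed.
flat = df.explode('var1', ignore_index=True)
print(flat)
```

On older versions the reset_index(drop=True) form shown above remains the way to go.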
    

    Note that explode also handles mixed columns of lists and scalars, as well as empty lists and NaNs, appropriately. Repeat-based solutions do not cope well with these cases, which is their main drawback.
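
To illustrate the mixed case, here is a small sketch with a made-up frame (the data is an assumption for illustration, not from the question) where one cell holds a list, one a plain string, one an empty list, and one NaN:

```python
import numpy as np
import pandas as pd

# Hypothetical data: a list, a scalar string, an empty list, and NaN
mixed = pd.DataFrame({
    'var1': [['a', 'b'], 'c', [], np.nan],
    'var2': [1, 2, 3, 4],
})

# Lists are expanded; the scalar string is kept as a single row;
# the empty list and the NaN each yield one NaN row.
out = mixed.explode('var1')
print(out)
```

A string counts as a scalar here, so 'c' is not split into characters.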

    However, note that as of pandas 0.25, explode only works on a single column.
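
Later releases appear to lift the single-column restriction: to my knowledge, from pandas 1.3 onward DataFrame.explode accepts a list of column names, provided the lists in each row have matching lengths across those columns. A sketch under that assumption, with made-up data:

```python
import pandas as pd

# Hypothetical frame: var1 and var2 hold lists of equal length per row
df2 = pd.DataFrame({
    'var1': [['a', 'b'], ['c']],
    'var2': [[1, 2], [3]],
    'var3': ['x', 'y'],
})

# Explode both list columns in lockstep (requires pandas >= 1.3);
# the scalar column var3 is repeated alongside.
out = df2.explode(['var1', 'var2'])
print(out)
```

If the list lengths differ within a row, pandas raises an error rather than guessing how to align them.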

    P.S.: if you are looking to explode a column of strings, you need to split on a separator first, then use explode. See this (very much) related answer by me.
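
The split-then-explode idea can be sketched as follows. The comma separator and the sample data are assumptions for illustration; substitute whatever delimiter your strings use:

```python
import pandas as pd

# Hypothetical frame with comma-separated strings in var1
s = pd.DataFrame({
    'var1': ['a,b,c', 'd,e', 'f'],
    'var2': [1, 2, 3],
})

# Step 1: turn each string into a list with str.split
# Step 2: explode the resulting list column into rows
out = s.assign(var1=s['var1'].str.split(',')).explode('var1')
print(out)
```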
