Split (explode) pandas dataframe string entry to separate rows

一向 2020-11-21 05:03

I have a pandas dataframe in which one column of text strings contains comma-separated values. I want to split each CSV field and create a new row per entry (as

22 answers
  •  闹比i (original poster)
    2020-11-21 05:13

    You can split and explode the column without changing the structure of the dataframe.

    Split and expand the data of a specific column

    Input:

        var1    var2
    0   a,b,c   1
    1   d,e,f   2
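
    import pandas as pd
    df = pd.DataFrame({'var1': ['a,b,c', 'd,e,f'], 'var2': [1, 2]})
    # Turn each comma-separated string into a list of values
    df['var1'] = df['var1'].str.split(',')
    # Emit one row per list element; the original index label is repeated (pandas >= 0.25)
    df = df.explode('var1')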
    

    Out:

        var1    var2
    0   a   1
    0   b   1
    0   c   1
    1   d   2
    1   e   2
    1   f   2
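
    The output above keeps the repeated index labels (0, 0, 0, 1, 1, 1). If a fresh 0..n-1 index is preferred, a minimal sketch of the same split-and-explode with a `reset_index` step might look like this. The variable name `out` is only for illustration and is not part of the original answer.

```python
import pandas as pd

# Same sample data as the input shown above
df = pd.DataFrame({"var1": ["a,b,c", "d,e,f"], "var2": [1, 2]})

# Split into lists, explode to one row per element, then renumber the rows
out = (
    df.assign(var1=df["var1"].str.split(","))
      .explode("var1")
      .reset_index(drop=True)
)
print(out)
```

    Using `assign` here leaves the original `df` untouched, which can be convenient when the unsplit strings are still needed later.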
    

    Edit-1

    Split and expand rows across multiple columns

        Filename    RGB                                                 RGB_type
    0   A           [[0, 1650, 6, 39], [0, 1691, 1, 59], [50, 1402...   [r, g, b]
    1   B           [[0, 1423, 16, 38], [0, 1445, 16, 46], [0, 141...   [r, g, b]
    

    Reindex on the reference column so that each row repeats once per list element, then align the column values within each Filename group

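    # Repeat each row once per element of its RGB_type list
    df = df.reindex(df.index.repeat(df['RGB_type'].apply(len)))
    # Per Filename group, expand the first row's list-valued cells into one row per element
    df = df.groupby('Filename').apply(lambda x: x.apply(lambda y: pd.Series(y.iloc[0])))
    # Scalar cells (e.g. Filename) land only on the first row of each group, so forward-fill them
    df.reset_index(drop=True).ffill()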
    

    Out:

                    Filename    RGB_type    Top 1 colour    Top 1 frequency    Top 2 colour    Top 2 frequency
    Filename
    A           0   A           r           0               1650               6               39
                1   A           g           0               1691               1               59
                2   A           b           50              1402               49              187
    B           0   B           r           0               1423               16              38
                1   B           g           0               1445               16              46
                2   B           b           0               1419               16              39
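
    As an alternative to the reindex/groupby approach, newer pandas versions (1.3 and later) let `explode` take a list of columns, provided the lists in each row have matching lengths. The sketch below is not the original answer's method. It rebuilds the sample frame from the values visible in the output above, and the names `exploded`, `values`, and `out` are hypothetical.

```python
import pandas as pd

# Sample frame reconstructed from the values shown in the output above
df = pd.DataFrame({
    "Filename": ["A", "B"],
    "RGB": [
        [[0, 1650, 6, 39], [0, 1691, 1, 59], [50, 1402, 49, 187]],
        [[0, 1423, 16, 38], [0, 1445, 16, 46], [0, 1419, 16, 39]],
    ],
    "RGB_type": [["r", "g", "b"], ["r", "g", "b"]],
})

# Explode both list columns in lockstep (needs pandas >= 1.3)
exploded = df.explode(["RGB", "RGB_type"]).reset_index(drop=True)

# Spread each 4-element RGB list into its own named columns
cols = ["Top 1 colour", "Top 1 frequency", "Top 2 colour", "Top 2 frequency"]
values = pd.DataFrame(exploded["RGB"].tolist(), columns=cols, index=exploded.index)

out = pd.concat([exploded[["Filename", "RGB_type"]], values], axis=1)
print(out)
```

    This yields a flat 0..5 index rather than the grouped Filename index shown in the output above.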
    
