Splitting a List inside a Pandas DataFrame

南方客 2021-02-02 01:09

I have a csv file that contains a number of columns. Using pandas, I read this csv file into a dataframe and have a datetime index and five or six other columns.

One of

4 Answers
  •  遥遥无期
    2021-02-02 01:47

    The way I did it was to split the list into separate columns and then melt the result so that each timestamp ends up in its own row.

    In [48]: df = pd.DataFrame([[1,2,[1,2,4]],[4,5,[1,3]],],columns=['a','b','TimeStamp'])
        ...: df
    Out[48]: 
       a  b  TimeStamp
    0  1  2  [1, 2, 4]
    1  4  5     [1, 3]
    

    You can convert the column to a list and then back to a DataFrame to split it into columns:

    In [53]: TScolumns = pd.DataFrame(df.TimeStamp.tolist())
        ...: TScolumns
    Out[53]: 
       0  1   2
    0  1  2   4
    1  1  3 NaN
    

    Then splice it onto the original DataFrame, dropping the list column first:

    In [90]: df = df.drop('TimeStamp',axis=1)
    In [58]: split = pd.concat([df, TScolumns], axis=1)
        ...: split
    Out[58]: 
       a  b  0  1   2
    0  1  2  1  2   4
    1  4  5  1  3 NaN
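
    One caveat for the datetime-indexed frame described in the question: pd.concat with axis=1 aligns rows on index labels, so the frame built from the lists should carry the same index as the original. The sketch below assumes a small made-up frame (the dates and the name ts_columns are illustrative, not from the question) and passes index=df.index when building the split columns.

```python
import pandas as pd

# Hypothetical data with a datetime index, standing in for the CSV in the question.
idx = pd.to_datetime(["2021-01-01", "2021-01-02"])
df = pd.DataFrame(
    {"a": [1, 4], "b": [2, 5], "TimeStamp": [[1, 2, 4], [1, 3]]},
    index=idx,
)

# Reuse the original index so the new columns line up row by row;
# without it, the new frame would get a default 0..n-1 index.
ts_columns = pd.DataFrame(df["TimeStamp"].tolist(), index=df.index)

# Drop the list column and attach the split columns side by side.
split = pd.concat([df.drop(columns="TimeStamp"), ts_columns], axis=1)
print(split)
```

    With the default 0..n-1 index used in the example above this step makes no difference, which is why the original session works without it.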
    

    Finally, use melt to get it into the shape you want:

    In [89]: pd.melt(split, id_vars=['a', 'b'], value_name='TimeStamp')
    Out[89]: 
       a  b variable  TimeStamp
    0  1  2        0          1
    1  4  5        0          1
    2  1  2        1          2
    3  4  5        1          3
    4  1  2        2          4
    5  4  5        2        NaN
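
    The steps above can be collected into one helper. This is only a sketch under a few assumptions: the function name split_list_column and the var_name "position" are made up for illustration, and the NaN rows that come from padding the shorter lists are dropped at the end, which the original session does not do.

```python
import pandas as pd

def split_list_column(df, column):
    """Return a long-format frame with one row per element of df[column]."""
    # Expand each list into its own columns; shorter lists are padded with NaN.
    expanded = pd.DataFrame(df[column].tolist(), index=df.index)
    # Replace the list column with the expanded columns.
    wide = pd.concat([df.drop(columns=column), expanded], axis=1)
    # Every original column except the list column identifies a row.
    id_vars = [c for c in df.columns if c != column]
    long = pd.melt(wide, id_vars=id_vars, var_name="position", value_name=column)
    # Remove the rows that exist only because of the NaN padding.
    return long.dropna(subset=[column]).reset_index(drop=True)

df = pd.DataFrame(
    [[1, 2, [1, 2, 4]], [4, 5, [1, 3]]],
    columns=["a", "b", "TimeStamp"],
)
result = split_list_column(df, "TimeStamp")
print(result)
```

    Note that melt discards the index, so a datetime index would need to be turned into a regular column first (for example with reset_index) to survive. On pandas 0.25 or later, df.explode('TimeStamp') is a different, built-in route to a similar one-row-per-element result.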
    
