Have Pandas column containing lists, how to pivot unique list elements to columns?

旧时难觅i 2021-02-07 22:39

I wrote a web scraper to pull information from a table of products and build a dataframe. The data table has a Description column that contains a comma-separated string of attr

5 answers
  •  谎友^ (OP)
    2021-02-07 23:33

    You can build an indicator matrix (mostly zeros, hence "sparse") with one column per unique attribute:

    In [27]: df
    Out[27]:
        PRODUCTS       DATE                DESCRIPTION
    0  Product A  2016-9-12  Steel, Red, High Hardness
    1  Product B  2016-9-11   Blue, Lightweight, Steel
    2  Product C  2016-9-12                        Red
    
    In [28]: (df.set_index(['PRODUCTS','DATE'])
       ....:    .DESCRIPTION.str.split(r',\s*', expand=True)
       ....:    .stack()
       ....:    .reset_index()
       ....:    .pivot_table(index=['PRODUCTS','DATE'], columns=0, fill_value=0, aggfunc='size')
       ....: )
    Out[28]:
    0                    Blue  High Hardness  Lightweight  Red  Steel
    PRODUCTS  DATE
    Product A 2016-9-12     0              1            0    1      1
    Product B 2016-9-11     1              0            1    0      1
    Product C 2016-9-12     0              0            0    1      0
    
    In [29]: (df.set_index(['PRODUCTS','DATE'])
       ....:    .DESCRIPTION.str.split(r',\s*', expand=True)
       ....:    .stack()
       ....:    .reset_index()
       ....:    .pivot_table(index=['PRODUCTS','DATE'], columns=0, fill_value='', aggfunc='size')
       ....: )
    Out[29]:
    0                   Blue High Hardness Lightweight Red Steel
    PRODUCTS  DATE
    Product A 2016-9-12                  1               1     1
    Product B 2016-9-11    1                         1         1
    Product C 2016-9-12                                  1
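
For comparison, here is a minimal sketch of a shorter route to the same 0/1 layout. It is not part of the session above. It assumes the sample data shown in In [27] and uses `Series.str.get_dummies`, which splits each string on a literal separator. The variable names (`normalized`, `dummies`, `result`, `from_lists`) are only illustrative. Whitespace around the commas is normalized first because `get_dummies` does not accept a regex separator.

```python
import pandas as pd

# Sample frame rebuilt from the In [27] output above.
df = pd.DataFrame({
    "PRODUCTS": ["Product A", "Product B", "Product C"],
    "DATE": ["2016-9-12", "2016-9-11", "2016-9-12"],
    "DESCRIPTION": [
        "Steel, Red, High Hardness",
        "Blue, Lightweight, Steel",
        "Red",
    ],
})

# Collapse ", " / " ," / "," to a bare comma so the literal separator matches.
normalized = (
    df["DESCRIPTION"]
    .str.replace(r"\s*,\s*", ",", regex=True)
    .str.strip()
)

# One 0/1 column per unique attribute, in sorted column order.
dummies = normalized.str.get_dummies(sep=",")

# Put the identifying columns back next to the indicator columns.
result = pd.concat([df[["PRODUCTS", "DATE"]], dummies], axis=1)
print(result)

# If the column already holds Python lists rather than strings,
# joining each list with "|" (the default separator) gives the same table.
as_lists = normalized.str.split(",")
from_lists = as_lists.str.join("|").str.get_dummies()
```

Unlike the `pivot_table(..., aggfunc='size')` version, this only marks presence (0 or 1) and does not count an attribute that is repeated within one description.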
    
