I wrote a web scraper to pull information from a table of products and build a dataframe. The data table has a Description column which contains a comma separated string of attr
You can build up a sparse indicator matrix, with one column per attribute:
In [27]: df
Out[27]:
    PRODUCTS       DATE                DESCRIPTION
0  Product A  2016-9-12  Steel, Red, High Hardness
1  Product B  2016-9-11   Blue, Lightweight, Steel
2  Product C  2016-9-12                        Red
In [28]: (df.set_index(['PRODUCTS','DATE'])
   ....:    .DESCRIPTION.str.split(r',\s*', expand=True)
   ....:    .stack()
   ....:    .reset_index()
   ....:    .pivot_table(index=['PRODUCTS','DATE'], columns=0, fill_value=0, aggfunc='size')
   ....: )
Out[28]:
0                    Blue  High Hardness  Lightweight  Red  Steel
PRODUCTS  DATE
Product A 2016-9-12     0              1            0    1      1
Product B 2016-9-11     1              0            1    0      1
Product C 2016-9-12     0              0            0    1      0
Or, to show blanks instead of zeros, pass fill_value='':
In [29]: (df.set_index(['PRODUCTS','DATE'])
   ....:    .DESCRIPTION.str.split(r',\s*', expand=True)
   ....:    .stack()
   ....:    .reset_index()
   ....:    .pivot_table(index=['PRODUCTS','DATE'], columns=0, fill_value='', aggfunc='size')
   ....: )
Out[29]:
0                    Blue  High Hardness  Lightweight  Red  Steel
PRODUCTS  DATE
Product A 2016-9-12                    1                 1      1
Product B 2016-9-11     1                           1           1
Product C 2016-9-12                                      1
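
For comparison, here is a minimal sketch of a different route to a similar table, using Series.str.get_dummies rather than the split/stack/pivot_table chain above. It is not the method shown in the session, just an alternative under stated assumptions: the sample frame is retyped from the example, and the names dummies and result are illustrative only.

```python
import pandas as pd

# Sample data retyped from the example above.
df = pd.DataFrame({
    "PRODUCTS": ["Product A", "Product B", "Product C"],
    "DATE": ["2016-9-12", "2016-9-11", "2016-9-12"],
    "DESCRIPTION": ["Steel, Red, High Hardness", "Blue, Lightweight, Steel", "Red"],
})

# get_dummies takes a literal separator (not a regex), so first collapse
# any whitespace around the commas, then build one 0/1 column per attribute.
dummies = (
    df["DESCRIPTION"]
    .str.strip()
    .str.replace(r"\s*,\s*", ",", regex=True)
    .str.get_dummies(sep=",")
)

# Put the indicator columns next to the identifying columns.
result = pd.concat([df[["PRODUCTS", "DATE"]], dummies], axis=1)
print(result)
```

One difference to keep in mind: get_dummies records presence (0 or 1), whereas aggfunc='size' counts occurrences, so the two only agree when no attribute is repeated within a single description.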