I wrote a web scraper to pull information from a table of products and build a dataframe. The data table has a Description column which contains a comma separated string of attr
How about something that places an 'X' in the feature column if the product has that feature?
The code below builds a list of unique features ('Steel', 'Red', etc.), then adds a column for each feature to the original df. It then iterates through each row and, for each feature of that product, places an 'X' in the matching cell. It assumes each DESCRIPTION value is already a list of feature strings rather than one comma-separated string.
import pandas as pd

# collect every feature from every product's DESCRIPTION list
ml = [item for l in df.DESCRIPTION for item in l]
unique_list_of_attributes = sorted(set(ml))  # unique features list, sorted for a stable column order
# place empty columns in original df for each feature
df = pd.concat([df, pd.DataFrame(columns=unique_list_of_attributes)]).fillna(value='')
# add 'X' in column if product has feature
for index, row in df.iterrows():
    for attribute in row['DESCRIPTION']:
        df.loc[index, attribute] = 'X'
Updated with example output:
    PRODUCTS       DATE                 DESCRIPTION  Blue  HighHardness  \
0  Product A  2016-9-12  [Steel, Red, HighHardness]                   X
1  Product B  2016-9-11  [Blue, Lightweight, Steel]     X
2  Product C  2016-9-12                       [Red]

   Lightweight  Red  Steel
0                 X      X
1            X           X
2                 X
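
If DESCRIPTION still holds the raw comma-separated strings from the scraper, a loop-free variant is possible. The following is only a sketch under stated assumptions: the sample rows are made up to mirror the example output above, the features are assumed to be separated by a comma followed by a space, and the names `flags`, `marks` and `result` are illustrative. It uses `Series.str.get_dummies` to build one indicator column per feature and then maps the indicators to 'X' or an empty string.

```python
import pandas as pd

# Hypothetical sample data mirroring the example output above;
# DESCRIPTION is still a raw comma-separated string here.
df = pd.DataFrame({
    "PRODUCTS": ["Product A", "Product B", "Product C"],
    "DATE": ["2016-9-12", "2016-9-11", "2016-9-12"],
    "DESCRIPTION": [
        "Steel, Red, HighHardness",
        "Blue, Lightweight, Steel",
        "Red",
    ],
})

# Assumption: features are separated by ", " (comma plus space).
# get_dummies yields one indicator column per distinct feature.
flags = df["DESCRIPTION"].str.get_dummies(sep=", ")

# Turn the indicators into 'X' / '' markers, column by column.
marks = flags.astype(bool).apply(lambda col: col.map({True: "X", False: ""}))

# Optionally keep DESCRIPTION as a list of features, as in the output above.
df["DESCRIPTION"] = df["DESCRIPTION"].str.split(", ")

# Attach the marker columns next to the original columns.
result = pd.concat([df, marks], axis=1)
print(result)
```

The separator matters here: if the scraped strings use a bare comma or inconsistent spacing, the strings would need to be normalized first, otherwise 'Red' and ' Red' would end up as different columns.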