I have a dataframe like this:
--------------------------------------------------------------------
Product ProductType SKU Size
-------
This approach is fragile, so use it with caution:
Convert the Product column into lists whose lengths match the lists in the other columns (say, SKU). This will not work if the lists in SKU and Size have different lengths:
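# Wrap each value in a one-element list, then repeat it once per SKU entry.
# (map(list) would split a multi-character string into single characters.)
df["Product"] = df["Product"].map(lambda x: [x]) * df["SKU"].map(len)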
Out[184]:
                    SKU           Size       Product
0  [111, 222, 333, 444]  [XS, S, M, L]  [a, a, a, a]
1            [555, 666]         [M, L]        [b, b]
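Since the approach depends on the lists in SKU and Size having the same length in every row, it may be worth checking that before transforming anything. A minimal sketch, assuming the sample data shown above (the names sku_len, size_len, mismatched and lengths_match are only illustrative):

```python
import pandas as pd

# Sample data assumed from the output above
df = pd.DataFrame({
    "Product": ["a", "b"],
    "SKU": [[111, 222, 333, 444], [555, 666]],
    "Size": [["XS", "S", "M", "L"], ["M", "L"]],
})

# Length of the list in each row, per column
sku_len = df["SKU"].map(len)
size_len = df["Size"].map(len)

# Rows whose lists disagree in length would break the approach
mismatched = df[sku_len != size_len]
lengths_match = mismatched.empty
```

If lengths_match is False, the rows in mismatched are the ones to fix or exclude first.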
Take the sum of each column (summing lists concatenates them), convert the result with to_dict(), and pass it to the DataFrame constructor:
pd.DataFrame(df.sum().to_dict())
Out[185]:
  Product  SKU  Size
0       a  111    XS
1       a  222     S
2       a  333     M
3       a  444     L
4       b  555     M
5       b  666     L
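For comparison, the same flattening can probably be done with DataFrame.explode instead of the sum trick. This is a different technique from the one above, not a restatement of it, and it assumes pandas 1.3 or later, where explode accepts a list of columns. The sample data and the name flat are assumptions for illustration:

```python
import pandas as pd

# Sample data assumed from the example above
df = pd.DataFrame({
    "Product": ["a", "b"],
    "SKU": [[111, 222, 333, 444], [555, 666]],
    "Size": [["XS", "S", "M", "L"], ["M", "L"]],
})

# Explode both list columns together; Product is repeated for each element.
# Multi-column explode raises a ValueError if the lists in a row differ in length.
flat = df.explode(["SKU", "Size"], ignore_index=True)
```

This leaves df untouched and does not require converting Product to lists first.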
Edit:
For several columns, you can define the columns to be repeated:
cols_to_be_repeated = ["Product", "ProductType"]
Save the rows that have None values in a separate dataframe:
na_df = df[pd.isnull(df["SKU"])].copy()
Drop the rows containing None from the original dataframe:
df.dropna(inplace=True)
Iterate over those columns:
for col in cols_to_be_repeated:
    df[col] = df[col].map(lambda x: [x]) * df["SKU"].map(len)
And use the same approach:
pd.concat([pd.DataFrame(df.sum().to_dict()), na_df])
        Product  ProductType    SKU  Size
0       T-shirt          Top  111.0    XS
1       T-shirt          Top  222.0     S
2       T-shirt          Top  333.0     M
3       T-shirt          Top  444.0     L
4  Pant(Flared)      Bottoms  555.0     M
5  Pant(Flared)      Bottoms  666.0     L
2       Sweater          Top    NaN  None
These steps modify df in place, so it might be better to work on a copy of the original dataframe.
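One way to keep the original intact is to wrap the steps in a function that works on a copy. The sketch below is only an illustration: expand_lists is a hypothetical helper, the sample data is reconstructed from the output above, and it repeats values with Series.repeat and itertools.chain rather than the sum trick:

```python
from itertools import chain

import pandas as pd


def expand_lists(df, list_cols, repeat_cols):
    """Hypothetical helper: one output row per list element, input left untouched."""
    work = df.copy()

    # Set aside the rows that have no list in the first list column
    has_lists = work[list_cols[0]].notna()
    na_rows = work[~has_lists]
    work = work[has_lists]

    # Number of elements per row; assumes all list columns agree on this
    lengths = work[list_cols[0]].map(len).to_numpy()

    # Repeat the scalar columns and flatten the list columns
    data = {col: work[col].repeat(lengths).tolist() for col in repeat_cols}
    for col in list_cols:
        data[col] = list(chain.from_iterable(work[col]))

    expanded = pd.DataFrame(data, columns=list(df.columns))
    return pd.concat([expanded, na_rows])


# Sample data assumed from the output above
df = pd.DataFrame({
    "Product": ["T-shirt", "Pant(Flared)", "Sweater"],
    "ProductType": ["Top", "Bottoms", "Top"],
    "SKU": [[111, 222, 333, 444], [555, 666], None],
    "Size": [["XS", "S", "M", "L"], ["M", "L"], None],
})

out = expand_lists(df, ["SKU", "Size"], ["Product", "ProductType"])
```

Because the function only touches its own copy, df still holds the original lists afterwards.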