I have a pandas dataframe where the first column (CUSTOMER) is the name of the customer, and the customer's name is repeated once for every product the customer has purchased.
Self merge with crosstab
import pandas as pd
# Pair each product with every other product bought by the same customer
d1 = df.merge(df, on='Customer').query('Product_x != Product_y')
pd.crosstab(d1.Product_x, d1.Product_y)
Product_y  A  B  C
Product_x
A          0  2  1
B          2  0  1
C          1  1  0
You can see this answer for ideas on how to speed up the crosstab.
The key insight for this problem is the self merge: joining the dataframe to itself on the customer pairs up every two products bought by the same customer.
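For anyone who wants to try this end to end, here is a minimal sketch of the self merge followed by the crosstab. The sample data is hypothetical (the customer names c1 and c2 are made up for illustration), and it assumes the columns are called Customer and Product, with one row per purchase.

```python
import pandas as pd

# Hypothetical sample data: one row per (customer, product) purchase.
df = pd.DataFrame({
    "Customer": ["c1", "c1", "c2", "c2", "c2"],
    "Product": ["A", "B", "A", "B", "C"],
})

# Self merge: every product is paired with every product bought by the
# same customer. pandas adds the _x / _y suffixes to the clashing column.
pairs = df.merge(df, on="Customer")

# Drop the pairs where a product is matched with itself.
pairs = pairs[pairs["Product_x"] != pairs["Product_y"]]

# Count how often each pair of products shares a customer.
co_counts = pd.crosstab(pairs["Product_x"], pairs["Product_y"])
print(co_counts)
```

The result is symmetric because each pair shows up in both orders. If a customer can appear more than once with the same product, the counts would be inflated, so calling df.drop_duplicates() before the merge is probably what you want in that case.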