Pandas: occurrence matrix from one hot encoding from pandas dataframe

臣服心动 2021-01-21 18:12

I have a dataframe in one-hot format:

import pandas as pd

dummy_data = {'a': [0,0,1,0], 'b': [1,1,1,0], 'c': [0,1,0,1], 'd': [1,1,1,0]}
data = pd.DataFrame(dummy_data)

1 Answer
  •  轻奢々 (original poster)
    2021-01-21 18:41

    You can have some fun with matrix math!


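    import numpy as np

    # Boolean identity matrix; ~u masks out the diagonal (self co-occurrence).
    u = np.diag(np.ones(data.shape[1], dtype=bool))
    # data is the question's frame; data.T.dot(data) counts pairwise co-occurrences.
    data.T.dot(data) * (~u)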
    

       a  b  c  d
    a  0  1  0  1
    b  1  0  1  3
    c  0  1  0  1
    d  1  3  1  0
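
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same idea. It assumes the question's frame is named `data`, and it swaps the boolean-mask trick for `np.fill_diagonal`, which is just one alternative way to zero the diagonal. The name `cooc` is made up for illustration.

```python
import numpy as np
import pandas as pd

# One-hot frame from the question.
data = pd.DataFrame({
    "a": [0, 0, 1, 0],
    "b": [1, 1, 1, 0],
    "c": [0, 1, 0, 1],
    "d": [1, 1, 1, 0],
})

# Entry (i, j) of data.T.dot(data) counts the rows where
# columns i and j are both 1.
product = data.T.dot(data)

# The diagonal holds each column's own total, not a pairwise count.
# np.fill_diagonal modifies its argument in place, so work on a copy.
values = product.to_numpy().copy()
np.fill_diagonal(values, 0)

# Hypothetical name: the co-occurrence matrix with labels restored.
cooc = pd.DataFrame(values, index=product.index, columns=product.columns)
print(cooc)
```

The result is symmetric because the count for the pair (b, d) is the same as for (d, b). If you need the per-column totals as well, keep `product` and read them off its diagonal.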
    
