Perform an operation on all pairs of rows in a column

隐瞒了意图╮ 2021-01-21 13:10

Assume the following DataFrame:

id    A   
1     0
2     10
3     200
4     3000

I would like to make a calculation between all rows to all ot

2 answers
  •  悲&欢浪女
    2021-01-21 13:17

    Use broadcasted subtraction, then np.tril_indices_from to extract the strictly lower triangle (the positive differences, since A is increasing).

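    import numpy as np
    import pandas as pd

    # pandas <= 0.23
    # u = df['A'].values
    # pandas 0.24+
    u = df['A'].to_numpy()
    u2 = u[:, None] - u   # u2[i, j] = u[i] - u[j]

    # k=-1 skips the main diagonal (the zero self-differences)
    pd.Series(u2[np.tril_indices_from(u2, k=-1)])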
    
    0      10
    1     200
    2     190
    3    3000
    4    2990
    5    2800
    dtype: int64
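
To see why the broadcast works, here is a minimal self-contained sketch. It assumes the sample frame from the question; the names diff, rows, cols and pairwise are only illustrative.

```python
import numpy as np
import pandas as pd

# Sample frame from the question.
df = pd.DataFrame({"id": [1, 2, 3, 4], "A": [0, 10, 200, 3000]})

u = df["A"].to_numpy()

# u[:, None] has shape (4, 1) and u has shape (4,), so the subtraction
# broadcasts to a (4, 4) matrix with diff[i, j] = u[i] - u[j].
diff = u[:, None] - u

# Positions strictly below the main diagonal, i.e. every pair with i > j.
rows, cols = np.tril_indices_from(diff, k=-1)
pairwise = pd.Series(diff[rows, cols])

print(diff)
print(pairwise.tolist())
```

The full matrix is antisymmetric (diff[j, i] == -diff[i, j]), so keeping only the lower triangle reports each pair exactly once.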
    

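    Or, use np.subtract.outer to build the same matrix in one call. Older pandas versions accepted the Series directly, as in np.subtract.outer(*[df.A]*2); newer releases may raise NotImplementedError for ufunc.outer on a Series, so converting to an array first is the safer form.

    u = df['A'].to_numpy()
    u2 = np.subtract.outer(u, u)
    pd.Series(u2[np.tril_indices_from(u2, k=-1)])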
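
The outer pattern is not specific to subtraction: any binary NumPy ufunc has an .outer method. The sketch below wraps that in a hypothetical helper, pairwise_lower (not part of NumPy or pandas), assuming plain array input.

```python
import numpy as np

def pairwise_lower(values, ufunc=np.subtract):
    """Apply a binary ufunc to every pair (i, j) with i > j.

    Hypothetical helper for illustration only.
    """
    arr = np.asarray(values)
    full = ufunc.outer(arr, arr)  # full[i, j] = ufunc(arr[i], arr[j])
    return full[np.tril_indices_from(full, k=-1)]

a = [0, 10, 200, 3000]
diffs = pairwise_lower(a)          # pairwise differences
sums = pairwise_lower(a, np.add)   # pairwise sums

print(diffs.tolist())
print(sums.tolist())
```

Note that this materialises the full n x n matrix before slicing, so memory grows quadratically with the number of rows.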
    

    If you also need the row and column positions of each pair, use

    idx = np.tril_indices_from(u2, k=-1)
    pd.DataFrame({
        'val': u2[idx],
        'row': idx[0],
        'col': idx[1]
    })
    
        val  row  col
    0    10    1    0
    1   200    2    0
    2   190    2    1
    3  3000    3    0
    4  2990    3    1
    5  2800    3    2
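
The row and col values above are positions (0-based), not the values of the id column. If the ids are wanted instead, one possible sketch, assuming the sample frame from the question and the illustrative column names id_row and id_col:

```python
import numpy as np
import pandas as pd

# Sample frame from the question.
df = pd.DataFrame({"id": [1, 2, 3, 4], "A": [0, 10, 200, 3000]})

u = df["A"].to_numpy()
ids = df["id"].to_numpy()

diff = u[:, None] - u
rows, cols = np.tril_indices_from(diff, k=-1)

# Translate positional indices into the matching `id` values.
pairs = pd.DataFrame({
    "id_row": ids[rows],
    "id_col": ids[cols],
    "val": diff[rows, cols],
})
print(pairs)
```

Each output row then reads as "A at id_row minus A at id_col".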
    
