Perform an operation on all pairs of rows in a column

隐瞒了意图╮ 2021-01-21 13:10

Assume the following DataFrame:

id    A   
1     0
2     10
3     200
4     3000

I would like to make a calculation between all rows to all ot

2 answers
  • 2021-01-21 13:15

    If I understand correctly, you can use itertools.combinations:

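    import itertools
    import pandas as pd

    # every unordered pair of index labels, each pair once
    s = list(itertools.combinations(df.index, 2))
    # later row minus earlier row for each pair
    pd.Series([df.A.loc[x[1]] - df.A.loc[x[0]] for x in s])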
    Out[495]: 
    0      10
    1     200
    2    3000
    3     190
    4    2990
    5    2800
    dtype: int64
    

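    Update: to keep the pair of index labels next to each difference, append the difference to each pair tuple.

    s = list(itertools.combinations(df.index, 2))

    # x is an (i, j) tuple; x + (diff,) gives (i, j, diff)
    pd.DataFrame([x + (df.A.loc[x[1]] - df.A.loc[x[0]],) for x in s])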
    Out[518]: 
       0  1     2
    0  0  1    10
    1  0  2   200
    2  0  3  3000
    3  1  2   190
    4  1  3  2990
    5  2  3  2800
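
A self-contained sketch of the combinations approach, assuming the DataFrame from the question with a default integer index. The column names `id_a`, `id_b` and `diff` are made up for illustration and do not appear in the original answer; the pairs are built from the `id` column rather than the index.

```python
import itertools

import pandas as pd

# Sample frame from the question.
df = pd.DataFrame({"id": [1, 2, 3, 4], "A": [0, 10, 200, 3000]})

# Pair up (id, A) tuples; combinations yields each unordered pair once,
# in the order the rows appear.
rows = list(zip(df["id"], df["A"]))
pairs = pd.DataFrame(
    [(id_a, id_b, b - a)
     for (id_a, a), (id_b, b) in itertools.combinations(rows, 2)],
    columns=["id_a", "id_b", "diff"],  # hypothetical column names
)
print(pairs)
```

Because this loops in Python over n*(n-1)/2 pairs, it is probably best suited to short columns.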
    
  • 2021-01-21 13:17

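    Use broadcasted subtraction, then np.tril_indices_from to extract the strictly lower triangle (k=-1 skips the diagonal). With values sorted in ascending order, as here, these are the positive differences.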

    import numpy as np
    import pandas as pd

    # pandas <= 0.23: u = df['A'].values
    # pandas 0.24+:
    u = df['A'].to_numpy()
    u2 = u[:, None] - u   # u2[i, j] = u[i] - u[j]
    
    pd.Series(u2[np.tril_indices_from(u2, k=-1)])
    
    0      10
    1     200
    2     190
    3    3000
    4    2990
    5    2800
    dtype: int64
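
To make the broadcasting step concrete, here is a minimal NumPy-only sketch using the values from the question. It assumes nothing beyond NumPy itself; the variable names `rows`, `cols` and `lower` are illustrative.

```python
import numpy as np

u = np.array([0, 10, 200, 3000])

# u[:, None] has shape (4, 1); subtracting u (shape (4,)) broadcasts
# to shape (4, 4), so u2[i, j] == u[i] - u[j].
u2 = u[:, None] - u

# k=-1 selects the entries strictly below the main diagonal (i > j),
# returned row by row.
rows, cols = np.tril_indices_from(u2, k=-1)
lower = u2[rows, cols]

print(u2)
print(lower)
```

The intermediate matrix holds n*n values, so memory use grows quadratically with the column length.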
    

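    Or, use subtract.outer to build the same matrix without explicit broadcasting. Older pandas versions accepted the Series directly (np.subtract.outer(*[df.A]*2)); newer versions may raise NotImplementedError for a ufunc's outer method on a Series, so passing the array is the safer form.

    u = df['A'].to_numpy()
    u2 = np.subtract.outer(u, u)
    pd.Series(u2[np.tril_indices_from(u2, k=-1)])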
    

    If you need the index positions as well, use

    idx = np.tril_indices_from(u2, k=-1)   # (row positions, column positions)
    pd.DataFrame({
        'val': u2[idx],
        'row': idx[0],
        'col': idx[1]
    })
    
        val  row  col
    0    10    1    0
    1   200    2    0
    2   190    2    1
    3  3000    3    0
    4  2990    3    1
    5  2800    3    2
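
The `row` and `col` values above are positions in the array, not the `id` values from the question. If the `id` labels are wanted instead, one possible sketch is to index the `id` column with those positions. The column names `row_id` and `col_id` are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Sample frame from the question.
df = pd.DataFrame({"id": [1, 2, 3, 4], "A": [0, 10, 200, 3000]})

u = df["A"].to_numpy()
ids = df["id"].to_numpy()

u2 = u[:, None] - u                        # u2[i, j] = u[i] - u[j]
row, col = np.tril_indices_from(u2, k=-1)  # positions with i > j

labelled = pd.DataFrame({
    "val": u2[row, col],
    "row_id": ids[row],  # hypothetical column name
    "col_id": ids[col],  # hypothetical column name
})
print(labelled)
```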
    