Assume the following DataFrame:
id A
1 0
2 10
3 200
4 3000
I would like to make a calculation betweeen all rows to all ot
IIUC itertools
import itertools
s=list(itertools.combinations(df.index, 2))
pd.Series([df.A.loc[x[1]]-df.A.loc[x[0]] for x in s ])
Out[495]:
0 10
1 200
2 3000
3 190
4 2990
5 2800
dtype: int64
Update
s=list(itertools.combinations(df.index, 2))
pd.DataFrame([x+(df.A.loc[x[1]]-df.A.loc[x[0]],) for x in s ])
Out[518]:
0 1 2
0 0 1 10
1 0 2 200
2 0 3 3000
3 1 2 190
4 1 3 2990
5 2 3 2800
Use broadcasted subtraction, then np.tril_indices
to extract the lower diagonal (positive values).
# <= 0.23
# u = df['A'].values
# 0.24+
u = df['A'].to_numpy()
u2 = (u[:,None] - u)
pd.Series(u2[np.tril_indices_from(u2, k=-1)])
0 10
1 200
2 190
3 3000
4 2990
5 2800
dtype: int64
Or, use subtract.outer
to avoid the conversion to array beforehand.
u2 = np.subtract.outer(*[df.A]*2)
pd.Series(u2[np.tril_indices_from(u2, k=-1)])
If you need the index as well, use
idx = np.tril_indices_from(u2, k=-1)
pd.DataFrame({
'val':u2[np.tril_indices_from(u2, k=-1)],
'row': idx[0],
'col': idx[1]
})
val row col
0 10 1 0
1 200 2 0
2 190 2 1
3 3000 3 0
4 2990 3 1
5 2800 3 2