Difference between scipy pairwise distance and X.X+Y.Y - X.Y^t

鱼传尺愫 2021-01-22 05:00

Let's imagine we have the following data:

import numpy as np

# Two small clusters of 2-D points, stacked into a single (6, 2) array
d1 = np.random.uniform(low=0, high=2, size=(3,2))
d2 = np.random.uniform(low=3, high=5, size=(3,2))
X = np.vstack((d1,d2))

X
arra         


        
1 Answer
  • 2021-01-22 05:36

    pdist(..., metric='seuclidean') computes the standardized Euclidean distance, not the squared Euclidean distance (which is what cal_pdist returns).

    From the docs:

    Y = pdist(X, 'seuclidean', V=None)

    Computes the standardized Euclidean distance. The standardized Euclidean distance between two n-vectors u and v is

       \sqrt{\sum_i (u_i - v_i)^2 / V[x_i]}
    

    V is the variance vector; V[i] is the variance computed over all the i’th components of the points. If not passed, it is automatically computed.
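To make the definition above concrete, here is a minimal sketch that recomputes the standardized Euclidean distance by hand and compares it with pdist. It assumes the default V is the per-column sample variance (ddof=1); the seeded generator and the names manual and seuclid_matches are illustrative and not part of the original question.

```python
import numpy as np
from scipy.spatial.distance import pdist

# Illustrative data shaped like the question's X (seeded for reproducibility)
rng = np.random.default_rng(0)
X = np.vstack((rng.uniform(0, 2, size=(3, 2)),
               rng.uniform(3, 5, size=(3, 2))))

# Assumption: when V is not passed, pdist uses the per-column sample variance
V = np.var(X, axis=0, ddof=1)

# All pairs i < j, in the same order as pdist's condensed output
i, j = np.triu_indices(len(X), k=1)

# sqrt(sum_k (u_k - v_k)^2 / V[k]) for each pair
manual = np.sqrt((((X[i] - X[j]) ** 2) / V).sum(axis=1))

seuclid_matches = np.allclose(manual, pdist(X, metric='seuclidean'))
print(seuclid_matches)
```

Each squared coordinate difference is divided by that coordinate's variance before summing, which is why the result does not match a plain squared Euclidean distance.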

    Try passing metric='sqeuclidean', and you will see that both functions return the same result to within rounding error.
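As a rough check of that claim, the sketch below compares pdist with the expansion ||x||^2 + ||y||^2 - 2 x.y. The question's cal_pdist is not shown in full, so sq_dists_via_gram is a hypothetical stand-in for it, and the seeded data is only illustrative.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform

def sq_dists_via_gram(X):
    """Hypothetical stand-in for the question's cal_pdist:
    squared Euclidean distances via ||x||^2 + ||y||^2 - 2 x.y."""
    sq_norms = (X ** 2).sum(axis=1)
    D = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (X @ X.T)
    # Clip tiny negative values produced by floating-point cancellation
    return np.maximum(D, 0.0)

# Illustrative data shaped like the question's X (seeded for reproducibility)
rng = np.random.default_rng(0)
X = np.vstack((rng.uniform(0, 2, size=(3, 2)),
               rng.uniform(3, 5, size=(3, 2))))

D_gram = sq_dists_via_gram(X)

# pdist returns a condensed vector; squareform expands it to an n x n matrix
D_sq = squareform(pdist(X, metric='sqeuclidean'))
D_se = squareform(pdist(X, metric='seuclidean'))

sq_matches = np.allclose(D_gram, D_sq)      # same quantity, up to rounding
se_matches = np.allclose(D_gram, D_se)      # different metric

print(sq_matches, se_matches)
```

The expansion subtracts nearly equal numbers for close points, so small discrepancies against pdist are expected; np.allclose absorbs them.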
