scipy.cluster.hierarchy: labels do not seem to be in the right order, and I am confused by the values on the vertical axis

温柔的废话 2021-01-28 07:27

I know that scipy.cluster.hierarchy is designed to work with a distance matrix, but what I have is a similarity matrix... After I plot it using dendrogram, something weird just

1 answer
  •  [愿得一人]
    2021-01-28 08:28

    • linkage expects "distances", not "similarities". If your similarities lie between 0 and 1 (with 1 on the diagonal), you can convert the matrix to something like a distance matrix by subtracting it from 1:

      dist = 1 - similarityMatrix
      
    • linkage does not interpret a square array as a distance matrix. It expects the distance data in "condensed" form, that is, the upper triangle flattened into a one-dimensional vector. You can get that using scipy.spatial.distance.squareform:

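      import scipy.cluster.hierarchy as sch
      from scipy.spatial.distance import squareform

      # Turn similarities into distances, then flatten to condensed form.
      dist = 1 - similarityMatrix
      condensed_dist = squareform(dist)
      Z_sim = sch.linkage(condensed_dist)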
      

      (When you pass a two-dimensional array with shape (m, n) to linkage, it treats the rows as points in n-dimensional space, and computes the distances internally.)
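
    To make the two points above concrete, here is a minimal sketch. The 4x4 similarity matrix, the labels and the choice of method="average" are made up for illustration and are not taken from the question. It compares the merge heights obtained from the condensed distances with those obtained when the square matrix is passed directly, and reads the leaf order from dendrogram without plotting.

```python
import warnings

import numpy as np
from scipy.cluster.hierarchy import dendrogram, linkage
from scipy.spatial.distance import squareform

# Hypothetical similarity matrix: symmetric, values in [0, 1], ones on the diagonal.
labels = ["a", "b", "c", "d"]
similarity = np.array([
    [1.0, 0.9, 0.2, 0.1],
    [0.9, 1.0, 0.3, 0.2],
    [0.2, 0.3, 1.0, 0.8],
    [0.1, 0.2, 0.8, 1.0],
])

# Convert to distances; force an exact zero diagonal so squareform's checks pass.
dist = 1.0 - similarity
np.fill_diagonal(dist, 0.0)

# Condensed form: the n*(n-1)/2 upper-triangle entries as a 1-D vector.
condensed = squareform(dist)

# Correct usage: linkage receives the condensed distances.
Z = linkage(condensed, method="average")
heights = Z[:, 2]  # merge heights, i.e. the values on the dendrogram's vertical axis

# no_plot=True returns the layout data only; "ivl" holds the leaf labels
# in the left-to-right order in which they would be drawn.
leaf_order = dendrogram(Z, labels=labels, no_plot=True)["ivl"]

# For comparison: passing the square matrix makes linkage treat each row as a
# point in 4-dimensional space, so the heights are Euclidean distances between
# rows rather than 1 - similarity. SciPy warns about this; the warning is
# silenced here only to keep the output clean.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    Z_square = linkage(dist, method="average")

print("leaf order:", leaf_order)
print("heights from condensed input:", heights)
print("heights from square input:   ", Z_square[:, 2])
```

    With the condensed input, the heights are in the same units as 1 - similarity, so a merge at height 0.1 corresponds to a similarity of 0.9. The leaf order comes from the tree structure rather than from the row order of the input, which is why the labels can appear shuffled along the horizontal axis.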
