I know that scipy.cluster.hierarchy focuses on dealing with a distance matrix. But now I have a similarity matrix... After I plot it using dendrogram, something weird just
linkage expects "distances", not "similarities". To convert your matrix to something like a distance matrix, you can subtract it from 1 (this assumes the similarities lie in [0, 1], with 1 on the diagonal):
dist = 1 - similarityMatrix
linkage does not accept a square distance matrix. It expects the distance data to be in "condensed" form. You can get that using scipy.spatial.distance.squareform:
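import scipy.cluster.hierarchy as sch
from scipy.spatial.distance import squareform

# Turn similarities into distances, then into the condensed form linkage expects.
dist = 1 - similarityMatrix
condensed_dist = squareform(dist)
Z_sim = sch.linkage(condensed_dist)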
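For reference, here is a minimal end-to-end sketch of the steps above. The 4x4 similarity matrix is made up for illustration, and the choice of average linkage and a two-cluster cut are assumptions, not something taken from the question. The fill_diagonal call is only a guard in case the diagonal of your similarity matrix is not exactly 1, since squareform checks that the diagonal of its input is zero.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform

# Hypothetical similarity matrix: symmetric, values in [0, 1], ones on the diagonal.
similarity = np.array([
    [1.0, 0.9, 0.2, 0.1],
    [0.9, 1.0, 0.3, 0.2],
    [0.2, 0.3, 1.0, 0.8],
    [0.1, 0.2, 0.8, 1.0],
])

# Similarity -> distance.
dist = 1.0 - similarity
np.fill_diagonal(dist, 0.0)  # guard against rounding error on the diagonal

# Square (4, 4) matrix -> condensed vector of length 4 * 3 / 2 = 6.
condensed = squareform(dist)

# Hierarchical clustering on the condensed distances (average linkage is an assumption).
Z = linkage(condensed, method="average")

# Cut the tree into at most two flat clusters.
labels = fcluster(Z, t=2, criterion="maxclust")
```

With this toy data, items 0 and 1 end up in one cluster and items 2 and 3 in the other, which matches the high similarities within each pair.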
(When you pass a two-dimensional array with shape (m, n) to linkage, it treats the rows as points in n-dimensional space, and computes the distances internally.)
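To see why passing the square matrix directly gives a "weird" dendrogram, the following sketch compares the two calls on a made-up 4x4 distance matrix (the data is hypothetical). Passing the square matrix is equivalent to running pdist on its rows first, so the merge heights are Euclidean distances between rows rather than the distances you supplied. Recent SciPy versions may also emit a ClusterWarning in this situation; it is silenced here only to keep the output clean.

```python
import warnings

import numpy as np
from scipy.cluster.hierarchy import linkage
from scipy.spatial.distance import pdist, squareform

# Hypothetical square distance matrix (symmetric, zero diagonal).
dist = np.array([
    [0.0, 0.1, 0.8, 0.9],
    [0.1, 0.0, 0.7, 0.8],
    [0.8, 0.7, 0.0, 0.2],
    [0.9, 0.8, 0.2, 0.0],
])

# Intended usage: condensed distances.
Z_right = linkage(squareform(dist))

# Passing the square matrix: rows are treated as 4 points in 4-dimensional space.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    Z_wrong = linkage(dist)

# The square-matrix call matches clustering on Euclidean distances between rows.
same_as_pdist = np.allclose(Z_wrong, linkage(pdist(dist)))

# The merge heights (third column of Z) differ from the intended result.
heights_differ = not np.allclose(Z_wrong[:, 2], Z_right[:, 2])
```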