Dendrogram or Other Plot from Distance Matrix

Submitted by 孤者浪人 on 2019-12-04 05:25:07
Warren Weckesser

The first argument of linkage should not be the square distance matrix. It must be the condensed distance matrix, i.e. the upper-triangle entries flattened row by row into a 1-D array. In your case, that would be np.array([2.0, 3.8459253727671276e-16, 2.0]). You can convert from the square distance matrix to the condensed form using scipy.spatial.distance.squareform.
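As a minimal sketch of that conversion: the square matrix below is an assumption, reconstructed from the condensed values quoted above rather than copied from the question, and the variable names are only illustrative.

```python
import numpy as np
from scipy.spatial.distance import squareform

# Assumed square distance matrix (symmetric, zero diagonal), rebuilt from
# the condensed values quoted in the answer.
tiny = 3.8459253727671276e-16
square = np.array([
    [0.0,  2.0, tiny],
    [2.0,  0.0, 2.0],
    [tiny, 2.0, 0.0],
])

# Square (n x n) -> condensed (n*(n-1)/2,), ordered d(0,1), d(0,2), d(1,2).
condensed = squareform(square)
print(condensed.shape)
print(condensed)

# squareform also converts in the other direction: condensed -> square.
roundtrip = squareform(condensed)
print(np.array_equal(roundtrip, square))
```

squareform picks the direction from the number of dimensions of its input, so the same function handles both conversions.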

If you pass a two-dimensional array to linkage with shape (m, n), it treats it as an array of m points in n-dimensional space and it computes the distances of those points itself. That's why you didn't get an error when you passed in the square distance matrix, but you got an incorrect plot. (When this answer was written, this was an undocumented "feature" of linkage; more recent SciPy documentation does describe it.)
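To make the difference concrete, here is a small sketch comparing the two calls on a 3x3 example matrix (the values are illustrative). It assumes the default Euclidean metric; some SciPy versions also emit a ClusterWarning when the input looks like an uncondensed distance matrix, so warnings are silenced here.

```python
import warnings

import numpy as np
from scipy.cluster.hierarchy import linkage
from scipy.spatial.distance import pdist, squareform

mat = np.array([[0.0, 2.0, 0.1], [2.0, 0.0, 2.0], [0.1, 2.0, 0.0]])

# Intended use: condensed distances in, merge heights are the given distances.
correct = linkage(squareform(mat), "single")

with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    # Square matrix in: each row is treated as a point in 3-D space.
    wrong = linkage(mat, "single")

# The "wrong" result matches clustering the Euclidean distances between rows.
rows_as_points = linkage(pdist(mat), "single")

print("merge heights (condensed input):", correct[:, 2])
print("merge heights (square input):   ", wrong[:, 2])
print("square input == pdist of rows:  ", np.allclose(wrong, rows_as_points))
```

The third column of a linkage matrix holds the merge heights, so comparing that column is a quick way to check which interpretation linkage applied.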

Also note that because the distance 3.8e-16 is so small, the horizontal line associated with the link between points 0 and 2 might not be visible in the plot--it is on the x axis.
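If the link cannot be seen in the plot, one way to confirm that it exists is to inspect the linkage matrix directly. This is only a sketch using the condensed values quoted earlier; the loop variable names are illustrative.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage

# Condensed distances d(0,1), d(0,2), d(1,2) as quoted in the answer.
dists = np.array([2.0, 3.8459253727671276e-16, 2.0])
Z = linkage(dists, "single")

# Each row of Z is: cluster a, cluster b, merge height, size of new cluster.
for a, b, height, size in Z:
    print(f"merge {int(a)} + {int(b)} at height {height:.3g} (size {int(size)})")
```

The first row records the merge of points 0 and 2 at a height that is effectively zero, which is why the corresponding horizontal line coincides with the x axis.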

Here's a modified version of your script. For this example, I've changed that tiny distance to 0.1, so the associated cluster is not obscured by the x axis.

import numpy as np

from scipy.cluster.hierarchy import dendrogram, linkage
from scipy.spatial.distance import squareform

import matplotlib.pyplot as plt

# Square, symmetric distance matrix with a zero diagonal.
mat = np.array([[0.0, 2.0, 0.1], [2.0, 0.0, 2.0], [0.1, 2.0, 0.0]])
# Convert to the condensed form that linkage expects.
dists = squareform(mat)
# Single-linkage clustering on the condensed distances.
linkage_matrix = linkage(dists, "single")
dendrogram(linkage_matrix, labels=["0", "1", "2"])
plt.title("test")
plt.show()

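The plot created by the script (image not reproduced here) shows points 0 and 2 merging at height 0.1, and that cluster joining point 1 at height 2.0.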
