Label Ordering in Scipy Dendrogram

Submitted by 人盡茶涼 on 2019-12-13 03:17:22

Question


In Python, I have an N-by-N distance matrix dmat, where dmat[i,j] encodes the distance from entity i to entity j. I'd like to view it as a dendrogram. I did:

from scipy.cluster.hierarchy import dendrogram, linkage
import matplotlib.pylab as plt

labels = [...]  # names of entities 1, 2, 3, ...

Z=linkage(dmat)
dn=dendrogram(Z,labels=labels)
plt.show()

But the label ordering looks wrong. Some entities are very close according to dmat, yet they are not placed near each other in the dendrogram. What's going on?


Answer 1:


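The first argument to linkage must be either the distances in condensed format, or the array of points being clustered. If you pass the square (N x N) distance matrix, linkage interprets it as N points in N-dimensional space, so the clustering is computed from the Euclidean distances between the rows of the matrix (the default metric) rather than from the distances you supplied.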

You can convert from your square matrix to the condensed form with scipy.spatial.distance.squareform.
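As a small illustration of what the condensed format looks like, here is a sketch using a hypothetical 4 x 4 matrix (the values are made up for this example, not taken from the question). It assumes the matrix is symmetric with a zero diagonal, which squareform checks by default.

```python
import numpy as np
from scipy.spatial.distance import squareform

# Hypothetical symmetric distance matrix with a zero diagonal (4 entities).
dmat = np.array([
    [0.0, 1.0, 4.0, 5.0],
    [1.0, 0.0, 3.0, 6.0],
    [4.0, 3.0, 0.0, 2.0],
    [5.0, 6.0, 2.0, 0.0],
])

# Condensed form: the upper triangle read row by row, N*(N-1)/2 entries.
condensed = squareform(dmat)
print(condensed)        # [1. 4. 5. 3. 6. 2.]
print(condensed.shape)  # (6,)

# squareform also converts back: a condensed vector yields the square matrix.
assert np.array_equal(squareform(condensed), dmat)
```

The condensed vector lists the pairs (0,1), (0,2), (0,3), (1,2), (1,3), (2,3) in that order, which is the layout linkage expects for precomputed distances.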

Add this to the beginning of your file:

from scipy.spatial.distance import squareform

and replace this

Z=linkage(dmat)

with

Z = linkage(squareform(dmat))
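To see the difference concretely, the following sketch runs both calls on a hypothetical 4 x 4 matrix (the labels and values are invented for illustration). It uses no_plot=True so the leaf order can be inspected without drawing; dropping that argument and calling plt.show() would draw the figure as in the question. Recent SciPy versions may warn when a square matrix that looks like a distance matrix is passed directly, so the deliberate misuse is wrapped in a warnings filter.

```python
import warnings

import numpy as np
from scipy.cluster.hierarchy import dendrogram, linkage
from scipy.spatial.distance import squareform

# Hypothetical data: 4 entities, symmetric distances, zero diagonal.
labels = ["a", "b", "c", "d"]
dmat = np.array([
    [0.0, 1.0, 4.0, 5.0],
    [1.0, 0.0, 3.0, 6.0],
    [4.0, 3.0, 0.0, 2.0],
    [5.0, 6.0, 2.0, 0.0],
])

# Square matrix passed directly: each row is treated as a point in 4-D space,
# so merge heights are Euclidean distances between rows, not entries of dmat.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # the misuse here is deliberate
    Z_wrong = linkage(dmat)

# Condensed form: merge heights come straight from dmat.
Z = linkage(squareform(dmat))

print(Z_wrong[0, 2])  # 2.0, the Euclidean distance between rows 0 and 1
print(Z[0, 2])        # 1.0, i.e. dmat[0, 1]

# "ivl" holds the leaf labels in left-to-right plotting order.
dn = dendrogram(Z, labels=labels, no_plot=True)
print(dn["ivl"])
```

In this small case the two trees happen to group the same entities, but the merge heights already disagree; with larger matrices the groupings themselves can differ, which is what makes the label ordering look wrong.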


Source: https://stackoverflow.com/questions/48331537/label-ordering-in-scipy-dendrogram
