I have a distance matrix with about 5000 entries, and use scipy's hierarchical clustering methods to cluster the matrix. The code I use for this is the following snippet:
The dictionary returned by scipy.cluster.hierarchy.dendrogram has the key ivl, which the documentation describes as:
a list of labels corresponding to the leaf nodes
You can supply custom labels (using labels=<array of labels>) as input to the dendrogram function, but by default they are just the indices of the original observations. By comparing the original labels/indices with Z1['ivl'], you can determine what the original entries were.
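
As a rough illustration of that comparison, here is a minimal sketch. The random points, the "average" linkage method, and the item_N label names are assumptions made up for the example, not taken from the question. Only Z1 and the ivl key come from the discussion above. Passing no_plot=True makes dendrogram return the dictionary without drawing anything.

```python
import numpy as np
from scipy.cluster.hierarchy import dendrogram, linkage
from scipy.spatial.distance import pdist

# Hypothetical toy data standing in for the real ~5000-entry distance matrix.
rng = np.random.default_rng(0)
points = rng.random((6, 2))
condensed = pdist(points)                  # condensed pairwise distances
Z = linkage(condensed, method="average")   # linkage method is an assumption

# Default labels: 'ivl' holds the original observation indices as strings,
# listed in the order the leaves appear in the dendrogram.
Z1 = dendrogram(Z, no_plot=True)
leaf_order = [int(s) for s in Z1["ivl"]]

# The 'leaves' key holds the same order as integers.
print(leaf_order, Z1["leaves"])

# Custom labels: 'ivl' then holds the labels themselves, in leaf order.
names = [f"item_{i}" for i in range(len(points))]  # hypothetical labels
Z2 = dendrogram(Z, no_plot=True, labels=names)
print(Z2["ivl"])
```

With leaf_order in hand, points[leaf_order] (or the rows and columns of the square distance matrix indexed the same way) gives the original entries in dendrogram order.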