Question
I am using scipy.cluster.hierarchy.linkage as a clustering algorithm and pass the resulting linkage matrix to scipy.cluster.hierarchy.fcluster to get flat clusters at various thresholds.
I would like to calculate the silhouette score for each result and compare the scores to choose the best threshold. I would rather not implement this myself and would prefer to use scikit-learn's sklearn.metrics.silhouette_score. How can I rearrange my clustering results into an input for sklearn.metrics.silhouette_score?
Answer 1:
You don't have to.
The array of cluster labels returned by fcluster can be passed directly as the labels argument of silhouette_score.
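import scipy.cluster.hierarchy
import scipy.spatial.distance
from sklearn import metrics
# Assumes distmatrix holds only the upper triangle of the pairwise distances, so adding its transpose gives the full symmetric matrix.
distmatrix1 = scipy.spatial.distance.squareform(distmatrix + distmatrix.T)  # condensed form for linkage
ddgm = scipy.cluster.hierarchy.linkage(distmatrix1, method="average")
nodes = scipy.cluster.hierarchy.fcluster(ddgm, 4, criterion="maxclust")  # at most 4 flat clusters
metrics.silhouette_score(distmatrix + distmatrix.T, nodes, metric="precomputed")  # "precomputed" because the input is a distance matrix, not a feature array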
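To compare several thresholds, as the question asks, the same call can be put in a loop. The following is only a minimal sketch: the toy data, the candidate thresholds, and the variable names are assumptions made for illustration and do not come from the original answer. It cuts the tree with criterion="distance" and skips any cut that leaves fewer than 2 clusters or one cluster per sample, since silhouette_score raises an error in those cases.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import pdist, squareform
from sklearn.metrics import silhouette_score

# Hypothetical toy data: three well-separated 2-D blobs of 20 points each.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
points = np.vstack([c + rng.normal(scale=0.5, size=(20, 2)) for c in centers])

condensed = pdist(points)        # condensed distance vector, as linkage expects
square = squareform(condensed)   # full symmetric matrix for silhouette_score
Z = linkage(condensed, method="average")

scores = {}
for t in [1.0, 2.0, 5.0, 8.0, 20.0]:  # hypothetical candidate thresholds
    labels = fcluster(Z, t=t, criterion="distance")
    n_clusters = len(np.unique(labels))
    # silhouette_score needs 2 <= n_clusters <= n_samples - 1
    if 2 <= n_clusters <= len(points) - 1:
        scores[t] = silhouette_score(square, labels, metric="precomputed")

# Threshold with the highest silhouette score among the valid cuts.
best_threshold = max(scores, key=scores.get)
print(best_threshold, scores[best_threshold])
```

Several thresholds can produce the same flat clustering and therefore the same score; max then returns the first of them in iteration order.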
Source: https://stackoverflow.com/questions/27875056/how-to-calculate-silhouette-score-of-the-scipys-fcluster-using-scikit-learn-sil