Question
I'm working on k-means clustering for a uni assignment. We need to run clustering with Euclidean distance on one data set and with Jaccard distance on another. We need to explore a few different models to evaluate the number of clusters. For Euclidean distance this was quite straightforward using sklearn.metrics.silhouette_score, but that function does not appear to offer Jaccard distance as an option.
Does anyone have an idea of how to calculate the silhouette score for Jaccard distance? I have managed to create a matrix of all pairwise distances between the samples. I also used the elbow method with Euclidean distance; would that be a valid method for Jaccard as well?
Answer 1:
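Use metric="precomputed", so that silhouette_score treats its first argument as a pairwise distance matrix rather than a feature matrix:
from sklearn.metrics import silhouette_score
# dist_matrix: the square (n_samples x n_samples) Jaccard distance matrix you already built
# k_means_cluster_ids: the cluster label assigned to each sample
silhouette_score(dist_matrix, k_means_cluster_ids, metric="precomputed")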
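For a self-contained illustration, here is a minimal sketch that builds the Jaccard distance matrix with SciPy and passes it in as described above. The toy data X and the hand-assigned labels are made up for the example; in practice the labels would come from your own clustering step.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform
from sklearn.metrics import silhouette_score

# Hypothetical binary feature matrix: 6 samples, 4 features (illustration only).
X = np.array([
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 1, 1, 1],
    [0, 0, 1, 1],
], dtype=bool)

# pdist returns the condensed pairwise Jaccard distances;
# squareform expands them into a square matrix with zeros on the diagonal.
dist_matrix = squareform(pdist(X, metric="jaccard"))

# Hand-assigned cluster labels standing in for the output of a clustering step.
labels = np.array([0, 0, 0, 1, 1, 1])

# metric="precomputed" tells silhouette_score that dist_matrix holds distances.
score = silhouette_score(dist_matrix, labels, metric="precomputed")
print(score)
```

To compare candidate numbers of clusters, the same call can be repeated with the label vector produced for each k, keeping dist_matrix fixed.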
Source: https://stackoverflow.com/questions/60692894/silhouette-score-for-jaccard-distance