Question
Suppose there are 4 sets:
s1={1,2,3,4};
s2={2,3,4};
s3={2,3,4,5};
s4={1,3,4,5};
Is there a standard metric for the degree of similarity of this group of 4 sets?
Thank you for suggesting the Jaccard method. However, it seems to be pairwise. How can I compute the degree of similarity of the whole group of sets?
Answer 1:
Pairwise, you can compute the Jaccard distance of two sets. It is a distance between the two sets viewed as vectors of booleans in a space where {1, 2, 3…} are all unit vectors: 1 minus the size of the intersection divided by the size of the union.
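As a rough illustration of the pairwise computation, here is a minimal Python sketch using the four sets from the question. The function name `jaccard_distance` and the choice to treat two empty sets as identical are assumptions made for this example, not part of the original answer.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """Return 1 - |a & b| / |a | b| for two sets."""
    union = a | b
    if not union:
        # Assumption: two empty sets are considered identical.
        return 0.0
    return 1.0 - len(a & b) / len(union)


# The four sets from the question.
sets = {
    "s1": {1, 2, 3, 4},
    "s2": {2, 3, 4},
    "s3": {2, 3, 4, 5},
    "s4": {1, 3, 4, 5},
}

# Print the distance for every unordered pair of sets.
for (name_a, a), (name_b, b) in combinations(sets.items(), 2):
    print(name_a, name_b, round(jaccard_distance(a, b), 2))
```

For these sets the closest pairs are s1/s2 and s2/s3 (distance 0.25) and the farthest pair is s2/s4 (distance 0.6).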
Answer 2:
Your question isn't very specific, but I suppose you mean something like the "edit distance" between them, i.e. how much you need to change s1 to get to s2?
Check out the Wikipedia article on Edit distance.
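For plain sets, one possible reading of "edit distance" is the number of single-element insertions and deletions needed to turn one set into the other, which is the size of the symmetric difference. The sketch below assumes that reading; `set_edit_distance` is a hypothetical helper name, and this is not the string edit distance described in the Wikipedia article.

```python
def set_edit_distance(a, b):
    """Count the single-element insertions or deletions that turn a into b.

    Assumption: only insert and delete operations are allowed, so the
    count equals the size of the symmetric difference a ^ b.
    """
    return len(a ^ b)


s1 = {1, 2, 3, 4}
s2 = {2, 3, 4}
s4 = {1, 3, 4, 5}

print(set_edit_distance(s1, s2))  # delete 1
print(set_edit_distance(s2, s4))  # delete 2, insert 1 and 5
```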
Answer 3:
As Tobu said, I'd use the Jaccard index, which is just the size of the intersection divided by the size of the union of the sets.
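The same ratio can be applied to the whole group at once by intersecting and uniting all of the sets together. This is only one possible generalization, offered as a sketch rather than a standard definition taken from this thread; `group_jaccard` is a hypothetical name.

```python
def group_jaccard(sets):
    """Return |intersection of all sets| / |union of all sets|."""
    sets = [set(s) for s in sets]
    if not sets:
        raise ValueError("need at least one set")
    union = set().union(*sets)
    if not union:
        # Assumption: a group of empty sets counts as fully similar.
        return 1.0
    common = sets[0].intersection(*sets[1:])
    return len(common) / len(union)


group = [{1, 2, 3, 4}, {2, 3, 4}, {2, 3, 4, 5}, {1, 3, 4, 5}]
print(group_jaccard(group))
```

For the four sets in the question, the common elements are {3, 4} and the union is {1, 2, 3, 4, 5}, so the group index is 2/5 = 0.4. Note that this measure is strict: a single outlier set can drive it to 0, so averaging the pairwise indices may be a gentler alternative.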
Answer 4:
You could compute the size of the intersection of each pair of sets.
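A minimal sketch of that idea in Python, assuming the sets are kept in a dictionary keyed by name (the variable names here are illustrative only):

```python
from itertools import combinations

sets = {
    "s1": {1, 2, 3, 4},
    "s2": {2, 3, 4},
    "s3": {2, 3, 4, 5},
    "s4": {1, 3, 4, 5},
}

# Map each unordered pair of set names to the size of their intersection.
overlap = {
    (name_a, name_b): len(a & b)
    for (name_a, a), (name_b, b) in combinations(sets.items(), 2)
}

for pair, size in overlap.items():
    print(pair, size)
```

Raw intersection sizes are not normalized, so larger sets tend to score higher; dividing by the size of the union gives the Jaccard index mentioned in the other answers.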
Answer 5:
You could compute the Euclidean distance between them, and build a dendrogram from that to visualize similarity.
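One way to get Euclidean distances is to encode each set as a 0/1 indicator vector over the union of all elements. The sketch below does that with the standard library only (`math.dist` needs Python 3.8+); the encoding and the helper name `indicator` are assumptions for this example. The distances are listed in pair order (0,1), (0,2), (0,3), (1,2), (1,3), (2,3), which should match the condensed form that a hierarchical clustering routine such as SciPy's `scipy.cluster.hierarchy.linkage` accepts before plotting with `dendrogram`.

```python
import math
from itertools import combinations

sets = [{1, 2, 3, 4}, {2, 3, 4}, {2, 3, 4, 5}, {1, 3, 4, 5}]

# Fix an ordering of every element that appears in any set.
universe = sorted(set().union(*sets))


def indicator(s):
    """Encode a set as a 0/1 vector over the shared universe."""
    return [1 if x in s else 0 for x in universe]


vectors = [indicator(s) for s in sets]

# Pairwise Euclidean distances in condensed (upper-triangle) order.
condensed = [math.dist(u, v) for u, v in combinations(vectors, 2)]

for (i, j), d in zip(combinations(range(len(sets)), 2), condensed):
    print(f"s{i + 1} s{j + 1} {d:.3f}")
```

With 0/1 vectors, each distance is the square root of the number of elements in which the two sets differ.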
Source: https://stackoverflow.com/questions/2035326/computing-degree-of-similarity-among-a-group-of-sets