Jaccard Distance

鱼传尺愫 2021-01-01 07:52

I have a problem calculating the Jaccard distance for sets represented as bit vectors:

p1 = 10111;

p2 = 10011.

Size of intersection = 3; (How could we find it out?)

2 Answers
  • 2021-01-01 08:04

    Size of intersection = 3; (How could we find it out?)

    Count the set bits of p1 & p2 = 10011, which has three 1 bits.

    Size of union = 4, (How could we find it out?)

    Count the set bits of p1 | p2 = 10111, which has four 1 bits.

    "Vector" here means a binary array in which the i-th bit indicates whether the i-th element is present in the set.
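
The bitwise approach in this answer can be sketched in Python as follows. This is a minimal illustration, assuming each bit vector is stored as a non-negative integer; the helper name `popcount` is hypothetical and not part of the original answer.

```python
def popcount(x: int) -> int:
    """Return the number of 1 bits in a non-negative integer."""
    return bin(x).count("1")


# The two bit vectors from the question, written as binary literals.
p1 = 0b10111
p2 = 0b10011

# Bitwise AND keeps the positions set in both vectors (the intersection).
intersection_size = popcount(p1 & p2)

# Bitwise OR keeps the positions set in either vector (the union).
union_size = popcount(p1 | p2)

print(intersection_size, union_size)
```

For these inputs the intersection has 3 set bits and the union has 4, matching the sizes given in the question.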

  • If p1 = 10111 and p2 = 10011,

    Count the attributes (bit positions) for each combination of values in p1 and p2:

    • M11 = number of attributes where p1 and p2 both have the value 1,
    • M01 = number of attributes where p1 has the value 0 and p2 has the value 1,
    • M10 = number of attributes where p1 has the value 1 and p2 has the value 0,
    • M00 = number of attributes where p1 and p2 both have the value 0.

    For these vectors, M11 = 3, M01 = 0, M10 = 1 and M00 = 1.

    Jaccard similarity coefficient = J = intersection/union = M11/(M01 + M10 + M11) = 3 / (0 + 1 + 3) = 3/4,

    Jaccard distance = J' = 1 - J = 1 - 3/4 = 1/4,

    or equivalently J' = 1 - M11/(M01 + M10 + M11) = (M01 + M10)/(M01 + M10 + M11) = (0 + 1)/(0 + 1 + 3) = 1/4.
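
The counting scheme in this answer could be written out as below. This is only a sketch under stated assumptions: the vectors are given as equal-length strings of '0' and '1', the function names `jaccard_counts` and `jaccard_distance` are hypothetical, and returning a distance of 0 for two all-zero vectors is a convention chosen here, not something the answer specifies.

```python
from fractions import Fraction


def jaccard_counts(a: str, b: str) -> tuple[int, int, int, int]:
    """Return (M11, M01, M10, M00) for two equal-length bit strings."""
    if len(a) != len(b):
        raise ValueError("bit vectors must have the same length")
    pairs = list(zip(a, b))
    m11 = sum(x == "1" and y == "1" for x, y in pairs)
    m01 = sum(x == "0" and y == "1" for x, y in pairs)
    m10 = sum(x == "1" and y == "0" for x, y in pairs)
    m00 = sum(x == "0" and y == "0" for x, y in pairs)
    return m11, m01, m10, m00


def jaccard_distance(a: str, b: str) -> Fraction:
    """Jaccard distance (M01 + M10) / (M01 + M10 + M11) as an exact fraction."""
    m11, m01, m10, _ = jaccard_counts(a, b)
    denominator = m01 + m10 + m11
    if denominator == 0:
        # Assumption: two empty sets are treated as identical (distance 0).
        return Fraction(0)
    return Fraction(m01 + m10, denominator)


print(jaccard_counts("10111", "10011"))
print(jaccard_distance("10111", "10011"))
```

Note that M00 does not enter the formula, so padding both vectors with extra zero bits leaves the distance unchanged.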
