Computing similarity between two lists

失恋的感觉 2020-12-08 15:37

EDIT: Since everyone is getting confused, let me simplify my question. I have two ordered lists, and I just want to compute how similar one list is to the other.

Eg

7 Answers
  • 2020-12-08 16:03

    As you said, you want to compute how similar one list is to the other. A simple starting point is to count the number of inversions, i.e. pairs of items that appear in opposite relative order in the two lists. There is an O(N log N) divide-and-conquer approach to this (a modified merge sort). It is a very simple way to measure the "similarity" between two lists.

    For example, to compare how similar two people's music tastes are on a music website, take their rankings of the same set of songs and count the number of inversions between them. The lower the count, the more similar their tastes.
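
The divide-and-conquer counting mentioned above can be sketched roughly as follows. This is only a minimal illustration, assuming the input is a plain sequence of comparable values; the name `count_inversions` is made up for this example. It is a merge sort that also adds up, at each merge step, how many left-half elements each right-half element jumps over.

```python
def count_inversions(seq):
    """Count pairs (i, j) with i < j and seq[i] > seq[j] in O(N log N)."""

    def sort_and_count(items):
        # Returns (sorted copy of items, number of inversions in items).
        if len(items) <= 1:
            return items, 0
        mid = len(items) // 2
        left, left_inv = sort_and_count(items[:mid])
        right, right_inv = sort_and_count(items[mid:])

        merged = []
        split_inv = 0
        i = j = 0
        while i < len(left) and j < len(right):
            if left[i] <= right[j]:
                merged.append(left[i])
                i += 1
            else:
                # right[j] is smaller than every element still left in
                # `left`, so it forms an inversion with each of them.
                merged.append(right[j])
                split_inv += len(left) - i
                j += 1
        merged.extend(left[i:])
        merged.extend(right[j:])
        return merged, left_inv + right_inv + split_inv

    return sort_and_count(list(seq))[1]


print(count_inversions([1, 2, 3]))  # 0 (already in order)
print(count_inversions([3, 2, 1]))  # 3 (fully reversed)
```

Equal values are not counted as inversions here (`<=` in the merge), which is one possible convention for ties.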

    Since you already treat the "state of the art" system as the benchmark of correctness, counting inversions should give you a basic measure of how similar your ranking is to it. Of course this is only a starting point, but you can build on it depending on how strict you want to be about the "inversion gap", etc.

        D1 D2 D3 D4 D5 D6
        -----------------
    R1: 1, 7, 4, 5, 8, 9  [rankings from the 'state of the art' system]
    R2: 1, 7, 5, 4, 9, 6  [your rankings]
    

    Since the rankings are listed per document, you can write your own comparator function based on R1 (the ranking from the "state of the art" system) and then count the inversions in R2 with respect to that comparator.

    You can "penalize" the 'similarity' for each inversion found, i.e. each pair i < j with R2[i] >' R2[j], where >' denotes your own comparator.
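
As a rough sketch of how that penalty could work on the R1/R2 table above: the code below assumes each list holds the rank assigned to documents D1..D6, treats R1 as the reference order, and counts the document pairs that R2 orders the other way round. The function name `ranking_similarity` and the normalization to a score between 0 and 1 are assumptions made for this illustration, not part of the original answer. It uses a plain O(N^2) pair scan for readability.

```python
from itertools import combinations


def ranking_similarity(reference, candidate):
    """Return (inversions, score) for two per-document rank lists.

    An inversion is a pair of documents that `reference` and `candidate`
    rank in opposite order. The score is 1 minus the fraction of inverted
    pairs (an assumed normalization): 1.0 means identical order.
    """
    if len(reference) != len(candidate):
        raise ValueError("both rankings must cover the same documents")
    n = len(reference)
    total_pairs = n * (n - 1) // 2
    if total_pairs == 0:
        return 0, 1.0
    inversions = sum(
        1
        for i, j in combinations(range(n), 2)
        # A negative product means the two rankings disagree on (i, j).
        if (reference[i] - reference[j]) * (candidate[i] - candidate[j]) < 0
    )
    return inversions, 1.0 - inversions / total_pairs


r1 = [1, 7, 4, 5, 8, 9]  # 'state of the art' system
r2 = [1, 7, 5, 4, 9, 6]  # your rankings
inversions, score = ranking_similarity(r1, r2)
print(inversions, f"{score:.2f}")  # 3 0.80
```

For this table the disagreeing pairs are (D2, D6), (D3, D4) and (D5, D6), so 3 of the 15 pairs are inverted. For long lists the pair scan could be replaced by sorting the documents by R1 and running a merge-sort inversion count on the resulting R2 sequence.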

