Computing similarity between two lists

后端未结

关注

 7  2119

EDIT: as everyone is getting confused, I want to simplify my question. I have two ordered lists. Now, I just want to compute how similar one list is to the other.

相关标签:

7条回答

南笙

2020-12-08 16:03
As you said, you want to compute how similar one list is to the other. I think simplistically, you can start by counting the number of Inversions. There's a O(NlogN) divide and conquer approach to this. It is a very simple approach to measure the "similarity" between two lists.

e.g. you want to compare how 'similar' the music tastes are for two persons on a music website, you take their rankings of a set of songs and count the no. of inversions in it. Lesser the count, more 'similar' their taste is.

since you are already considering the "state of the art system" to be a benchmark of correctness, counting Inversions should give you a basic measure of 'similarity' of your ranking. Of course this is just a starters approach, but you can build on it as how strict you want to be with the "inversion gap" etc.
```
    D1 D2 D3 D4 D5 D6
    -----------------
R1: 1, 7, 4, 5, 8, 9  [Rankings from 'state of the art' system]
R2: 1, 7, 5, 4, 9, 6  [ your Rankings]
```
Since rankings are in order of documents you can write your own comparator function based on R1 (ranking of the "state of the art system" and hence count the inversions comparing to that comparator.

You can "penalize" 'similarity' for each inversions found: i < j but R2[i] >' R2[j]
( >' here you use your own comparator)

Links you may find useful:
Link1
Link2
Link3
0 讨论(0)
发布评论:

提交评论
- 加载中...

上一页 1 2