I am trying to calculate the semantic similarity between two words. I am using WordNet-based similarity measures, i.e. Resnik measure (RES), Lin measure (LIN), Jiang and Conrath measure
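Let's consider a single arbitrary similarity measure M and take an arbitrary word w. Define m = M(w, w). For a measure under which a word is at least as similar to itself as to any other word, m is the maximum value that M(w, u) can take over all words u.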
Let's define MN as the normalized version of M. For any two words w, u you can compute MN(w, u) = M(w, u) / m.
It's easy to see that if M takes non-negative values and m > 0, then 0 <= M(w, u) <= m, so MN takes values in [0, 1].
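The normalization step can be sketched in Python as below. This is only an illustration under stated assumptions: `normalize` and `shared_letters` are hypothetical names that do not come from any library, and `shared_letters` is a toy stand-in for a real WordNet measure, chosen because it satisfies M(w, u) <= M(w, w).

```python
from typing import Callable

# A similarity measure maps a pair of words to a number.
Measure = Callable[[str, str], float]


def normalize(measure: Measure) -> Measure:
    """Return MN with MN(w, u) = M(w, u) / M(w, w)."""

    def normalized(w: str, u: str) -> float:
        m = measure(w, w)  # self-similarity of w, used as the scale
        if m <= 0:
            raise ValueError("self-similarity must be positive to normalize")
        return measure(w, u) / m

    return normalized


def shared_letters(w: str, u: str) -> float:
    """Toy measure (assumption, not a WordNet measure): distinct letters in common."""
    return float(len(set(w) & set(u)))


norm = normalize(shared_letters)
print(norm("cat", "cat"))   # self-similarity normalizes to 1.0
print(norm("cat", "cart"))  # every letter of "cat" is in "cart"
print(norm("cart", "cat"))  # only 3 of the 4 letters of "cart" are in "cat"
```

Note that because the divisor is M(w, w), the normalized value can depend on argument order: here `norm("cat", "cart")` and `norm("cart", "cat")` differ. If you need a symmetric result, you would have to pick a symmetric scale instead, which is a design choice beyond what is described above.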
In order to compute your own measure F as a combination of k different measures m_1, m_2, ..., m_k, first normalize each m_i independently using the above method, and then define weights alpha_1, alpha_2, ..., alpha_k such that alpha_i denotes the weight of the i-th measure. All alphas must be non-negative and sum up to 1, i.e.:
alpha_1 + alpha_2 + ... + alpha_k = 1
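Then to compute your own measure for w, u you do:
F(w, u) = alpha_1 * m_1(w, u) + alpha_2 * m_2(w, u) + ... + alpha_k * m_k(w, u)
where each m_i is the normalized version of the i-th measure. Since every normalized m_i lies in [0, 1] and the weights are non-negative and sum to 1, F takes values in [0, 1].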
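A minimal sketch of the weighted combination, assuming the individual measures have already been normalized to [0, 1]. The names `combine`, `letter_overlap`, and `same_initial` are hypothetical, and the two toy measures merely stand in for normalized RES, LIN, or similar scores.

```python
from typing import Callable, Sequence

Measure = Callable[[str, str], float]


def combine(measures: Sequence[Measure], weights: Sequence[float]) -> Measure:
    """Return F with F(w, u) = sum_i alpha_i * m_i(w, u)."""
    if len(measures) != len(weights):
        raise ValueError("need exactly one weight per measure")
    if any(a < 0 for a in weights):
        raise ValueError("weights must be non-negative")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")

    def combined(w: str, u: str) -> float:
        return sum(a * m(w, u) for a, m in zip(weights, measures))

    return combined


# Toy measures already in [0, 1] (assumptions; both expect non-empty words).
def letter_overlap(w: str, u: str) -> float:
    return len(set(w) & set(u)) / len(set(w))


def same_initial(w: str, u: str) -> float:
    return 1.0 if w[0] == u[0] else 0.0


F = combine([letter_overlap, same_initial], [0.75, 0.25])
print(F("cat", "cat"))  # both measures give 1.0, so F gives 1.0
print(F("cat", "dog"))  # no shared letters, different initial
```

Checking the weights up front keeps the [0, 1] guarantee from being silently broken by a typo in the alphas.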