about cosine similarity
问题 I am finding cosine similarity between documents.. I did it like this D1=(8,0,0,1) where 8,0,0,1 are the tf-idf scores of the terms t1, t2, t3 , t4 D2=(7,0,0,1) cos(theta) = (56 + 0 + 0 + 1) / sqrt(64 + 49) sqrt(1 +1 ) which comes out to be cos(theta)= 5 Now what do I evaluate from this value... I don't get it what does cos(theta)=5 signify about the similarity between them... Am I doing things right? 回答1: The denominator is wrong. The cosine similarity is defined as D1 · D2 sim = ———————————