I need to find the cosine similarity between two frequency vectors in MATLAB.
Example vectors:
a = [2,3,4,4,6,1]
b = [1,3,2,4,6,3]
How can I do this?
Take a quick look at the mathematical definition of cosine similarity.
From the definition, you just need the dot product of the vectors divided by the product of the Euclidean norms of those vectors.
% MATLAB 2018b
a = [2,3,4,4,6,1];
b = [1,3,2,4,6,3];
cosSim = sum(a.*b)/sqrt(sum(a.^2)*sum(b.^2)); % 0.9436
Alternatively, you could use
cosSim = (a(:).'*b(:))/sqrt(sum(a.^2)*sum(b.^2)); % 0.9436
which gives the same result.
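For reference, the definition and the arithmetic for the example vectors can be written out as follows (a worked check, using only the values of a and b given above):

```latex
\cos\theta
  = \frac{a \cdot b}{\lVert a \rVert_2 \, \lVert b \rVert_2}
  = \frac{\sum_i a_i b_i}{\sqrt{\sum_i a_i^2}\,\sqrt{\sum_i b_i^2}}
  = \frac{74}{\sqrt{82}\,\sqrt{75}}
  = \frac{74}{\sqrt{6150}}
  \approx 0.9436
```

Here 74 is the dot product (2 + 9 + 8 + 16 + 36 + 3), and 82 and 75 are the sums of squares of a and b.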
The other answer is correct; to save you from following more links, here is another approach that uses MATLAB's built-in linear algebra functions, dot() and norm():
cosSim = dot(a,b)/(norm(a)*norm(b)); % 0.9436
See also the tag-wiki for cosine-similarity.
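If you need this in more than one place, the dot()/norm() expression could be wrapped in an anonymous function. This is only a sketch: cosineSim is a hypothetical name chosen for illustration, not a built-in MATLAB function, and the zero-vector case is shown because the denominator becomes 0 there.

```matlab
% cosineSim is a hypothetical helper name, not a built-in function.
cosineSim = @(u, v) dot(u, v) / (norm(u) * norm(v));

a = [2,3,4,4,6,1];
b = [1,3,2,4,6,3];

sRow = cosineSim(a, b);          % row vectors
sCol = cosineSim(a.', b.');      % column vectors give the same value

% An all-zero vector has norm 0, so the expression evaluates 0/0 = NaN.
sZero = cosineSim(zeros(1, 6), b);
```

If all-zero frequency vectors can occur in your data, you may want to check for NaN with isnan() before using the result.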
Performance by Approach:
sum(a.*b)/sqrt(sum(a.^2)*sum(b.^2))
(a(:).'*b(:))/sqrt(sum(a.^2)*sum(b.^2))
dot(a,b)/(norm(a)*norm(b))
Each point represents the geometric mean of the computation times for 10 randomly generated vectors.
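A comparison like this could be reproduced roughly as follows. This is a minimal sketch and not necessarily how the timings above were obtained: the vector length n is an assumption, and the geometric mean is computed by hand so that no toolbox is needed.

```matlab
% Benchmark sketch; n is an assumed vector length.
n = 1e4;
nTrials = 10;                 % 10 randomly generated vector pairs
t = zeros(nTrials, 3);        % one column per approach

for k = 1:nTrials
    a = rand(1, n);
    b = rand(1, n);
    t(k, 1) = timeit(@() sum(a.*b)/sqrt(sum(a.^2)*sum(b.^2)));
    t(k, 2) = timeit(@() (a(:).'*b(:))/sqrt(sum(a.^2)*sum(b.^2)));
    t(k, 3) = timeit(@() dot(a,b)/(norm(a)*norm(b)));
end

% Geometric mean of the times for each approach.
geoMeanTimes = exp(mean(log(t), 1));
```

Repeating this for several values of n would give one point per approach and vector length.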
If you have the Statistics and Machine Learning Toolbox, you can use the pdist2 function with the 'cosine' distance option, which returns 1 minus the cosine similarity:
a = [2,3,4,4,6,1];
b = [1,3,2,4,6,3];
result = 1-pdist2(a, b, 'cosine');
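pdist2 also works on matrices, treating each row as one observation, so the same call can compute all pairwise similarities at once. A small sketch, assuming the vectors are stacked as rows of a matrix X (a name introduced here for illustration):

```matlab
% Requires the Statistics and Machine Learning Toolbox.
X = [2,3,4,4,6,1;
     1,3,2,4,6,3];                 % each row is one frequency vector

% S(i,j) is the cosine similarity between row i and row j of X.
S = 1 - pdist2(X, X, 'cosine');
```

The off-diagonal entries S(1,2) and S(2,1) match the value computed above (about 0.9436), and the diagonal entries are 1 up to rounding.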