How to compute the cosine_similarity in pytorch for all rows in a matrix with respect to all rows in another matrix

谎友^ 2021-02-14 13:49

In PyTorch, given two matrices, how would I compute the cosine similarity of every row in one with every row in the other?

For example

Given the input =

2 answers
  • 2021-02-14 14:20

    Adding eps for numerical stability, based on benjaminplanche's answer:

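    import torch

    def sim_matrix(a, b, eps=1e-8):
        """
        Cosine similarity between all rows of a and all rows of b.
        eps guards against division by zero for all-zero rows.
        """
        # Row norms, kept as column vectors so they broadcast across each row
        a_n, b_n = a.norm(dim=1)[:, None], b.norm(dim=1)[:, None]
        # Clamp the norms from below by eps before dividing
        a_norm = a / torch.max(a_n, eps * torch.ones_like(a_n))
        b_norm = b / torch.max(b_n, eps * torch.ones_like(b_n))
        sim_mt = torch.mm(a_norm, b_norm.transpose(0, 1))
        return sim_mt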
    
  • 2021-02-14 14:25

    By manually computing the similarity and playing with matrix multiplication + transposition:

    import torch
    from scipy import spatial
    import numpy as np
    
    a = torch.randn(2, 2)
    b = torch.randn(3, 2) # different number of rows, for fun
    
    # Given that cos_sim(u, v) = dot(u, v) / (norm(u) * norm(v))
    #                          = dot(u / norm(u), v / norm(v))
    # We first normalize the rows, then compute all pairwise dot products via a matrix product with the transpose:
    a_norm = a / a.norm(dim=1)[:, None]
    b_norm = b / b.norm(dim=1)[:, None]
    res = torch.mm(a_norm, b_norm.transpose(0,1))
    print(res)
    #  0.9978 -0.9986 -0.9985
    # -0.8629  0.9172  0.9172
    
    # -------
    # Let's verify with numpy/scipy if our computations are correct:
    a_n = a.numpy()
    b_n = b.numpy()
    res_n = np.zeros((2, 3))
    for i in range(2):
        for j in range(3):
            # cos_sim(u, v) = 1 - cos_dist(u, v)
            res_n[i, j] = 1 - spatial.distance.cosine(a_n[i], b_n[j])
    print(res_n)
    # [[ 0.9978022  -0.99855876 -0.99854881]
    #  [-0.86285472  0.91716063  0.9172349 ]]
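
As a cross-check on the two answers above, here is a minimal sketch that does the same normalize-then-multiply computation with `torch.nn.functional.normalize` and compares it against the built-in `torch.nn.functional.cosine_similarity` applied with broadcasting. The function name `pairwise_cosine`, the tensor shapes, and the seed are assumptions chosen for illustration; they do not come from the original thread.

```python
import torch
import torch.nn.functional as F

def pairwise_cosine(a: torch.Tensor, b: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Cosine similarity of every row of a (n, d) with every row of b (m, d)."""
    # F.normalize divides each row by max(norm, eps), so all-zero rows stay finite.
    a_unit = F.normalize(a, p=2, dim=1, eps=eps)
    b_unit = F.normalize(b, p=2, dim=1, eps=eps)
    # Dot products of unit vectors are the cosine similarities; result is (n, m).
    return a_unit @ b_unit.T

torch.manual_seed(0)  # arbitrary seed, only for repeatability
a = torch.randn(2, 4)
b = torch.randn(3, 4)
sim = pairwise_cosine(a, b)

# Reference: broadcast (n, 1, d) against (1, m, d) and reduce over the last dim.
ref = F.cosine_similarity(a[:, None, :], b[None, :, :], dim=-1)

print(sim.shape)  # torch.Size([2, 3])
print(torch.allclose(sim, ref, atol=1e-6))
```

The broadcasting variant materializes an intermediate of shape (n, m, d), so for large matrices the normalize-then-matmul form is likely the more memory-friendly choice.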
    