Why does word2Vec use cosine similarity?

無奈伤痛 2021-02-01 21:11

I have been reading the papers on Word2Vec (e.g. this one), and I think I understand training the vectors to maximize the probability of other words found in the same contexts. However, I don't understand why cosine similarity is the right way to compare the resulting vectors, rather than, say, Euclidean distance.

2 Answers
  •  终归单人心
    2021-02-01 21:54

    Cosine similarity of two n-dimensional vectors A and B is defined as:

    $$\text{similarity}(A, B) = \cos(\theta) = \frac{A \cdot B}{\|A\| \, \|B\|} = \frac{\sum_{i=1}^{n} A_i B_i}{\sqrt{\sum_{i=1}^{n} A_i^2} \, \sqrt{\sum_{i=1}^{n} B_i^2}}$$

    which is simply the cosine of the angle between A and B,

    while the Euclidean distance is defined as

    $$d(A, B) = \|A - B\| = \sqrt{\sum_{i=1}^{n} (A_i - B_i)^2}$$
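    For concreteness, here is a minimal NumPy sketch of both quantities (the example vectors are made up for illustration):

        import numpy as np

        # Two arbitrary example vectors (the values are made up)
        A = np.array([0.3, -1.2, 0.7, 2.0])
        B = np.array([1.1, 0.4, -0.5, 1.3])

        # Cosine similarity: cosine of the angle between A and B
        cos_sim = A @ B / (np.linalg.norm(A) * np.linalg.norm(B))

        # Euclidean distance between A and B
        euclid = np.linalg.norm(A - B)

        print(f"cosine similarity: {cos_sim:.4f}")    # bounded in [-1, 1]
        print(f"euclidean distance: {euclid:.4f}")    # any non-negative value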

    Now think about the distance between two random elements of the vector space. The cosine distance is bounded: since cos ranges over [-1, 1], the cosine distance 1 - cos(θ) always lies in [0, 2].

    However, the Euclidean distance is unbounded: it can take any non-negative value.

    As the dimension n grows, the angle between two randomly chosen directions (points drawn symmetrically around the origin) concentrates more and more tightly around 90°, i.e. their cosine similarity tends to 0. By contrast, two random points in the unit cube of R^n have an expected Euclidean distance of roughly 0.41 * sqrt(n) (source).
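    This concentration is easy to check numerically. A quick simulation sketch: it draws directions from a Gaussian (any origin-symmetric distribution would do) and distances from the unit cube, with arbitrary sample counts:

        import numpy as np

        rng = np.random.default_rng(0)
        pairs = 2000  # number of random point pairs per dimension (arbitrary)

        for n in (2, 10, 100, 1000):
            # Angles: Gaussian samples point in uniformly random directions,
            # so the angle between two of them concentrates around 90 degrees.
            A, B = rng.standard_normal((2, pairs, n))
            cos = np.sum(A * B, axis=1) / (
                np.linalg.norm(A, axis=1) * np.linalg.norm(B, axis=1))
            angle = np.degrees(np.arccos(np.clip(cos, -1.0, 1.0))).mean()

            # Distances: uniform samples in the unit cube [0, 1]^n; the mean
            # Euclidean distance grows like sqrt(n/6), about 0.41 * sqrt(n).
            P, Q = rng.random((2, pairs, n))
            dist = np.linalg.norm(P - Q, axis=1).mean()

            print(f"n={n:>4}  mean angle: {angle:6.2f} deg"
                  f"  mean distance: {dist:7.3f}"
                  f"  (0.41*sqrt(n) = {0.41 * np.sqrt(n):.3f})")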

    TL;DR

    Cosine distance is better suited to vectors in a high-dimensional space because of the curse of dimensionality. (I'm not absolutely sure about this, though.)
