Python pandas: Finding cosine similarity of two columns

Submitted by 别等时光非礼了梦想 on 2019-11-29 10:15:25

Question


Suppose I have two columns in a pandas DataFrame:

          col1 col2
item_1    158  173
item_2     25  191
item_3    180   33
item_4    152  165
item_5     96  108

What's the best way to take the cosine similarity of these two columns?


Answer 1:


Is that what you're looking for?

from scipy.spatial.distance import cosine
from pandas import DataFrame


df = DataFrame({"col1": [158, 25, 180, 152, 96],
                "col2": [173, 191, 33, 165, 108]})

print(1 - cosine(df["col1"], df["col2"]))
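
Note that scipy's cosine returns the cosine distance, which is why the answer subtracts it from 1 to get the similarity. As a rough sketch of what that computes, the same value can be obtained from the definition (dot product divided by the product of the norms) with NumPy. The helper name cosine_sim is made up for this illustration, and the sketch assumes neither column is all zeros:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"col1": [158, 25, 180, 152, 96],
                   "col2": [173, 191, 33, 165, 108]})


def cosine_sim(a, b):
    """Cosine similarity of two 1-D sequences (hypothetical helper)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # Dot product divided by the product of the Euclidean norms.
    # Assumes neither vector is all zeros (otherwise this divides by zero).
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


similarity = cosine_sim(df["col1"], df["col2"])
print(round(similarity, 4))  # 0.7498
```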



Answer 2:


You can also use cosine_similarity or other similarity metrics from sklearn.metrics.pairwise.

from sklearn.metrics.pairwise import cosine_similarity

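# cosine_similarity expects 2-D inputs of shape (n_samples, n_features),
# so each column is reshaped into a single row first.
cosine_similarity(df["col1"].to_numpy().reshape(1, -1),
                  df["col2"].to_numpy().reshape(1, -1))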
Out[4]: array([[0.7498213]])
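
If the DataFrame has more than two columns, one possible approach is to pass the transposed frame so that every column becomes a row, which gives the similarity between every pair of columns at once. This is only a sketch: the variable names pairwise and sim are chosen for this illustration, and it assumes all columns are numeric.

```python
import pandas as pd
from sklearn.metrics.pairwise import cosine_similarity

df = pd.DataFrame({"col1": [158, 25, 180, 152, 96],
                   "col2": [173, 191, 33, 165, 108]})

# df.T has one row per original column, so the result is a
# (n_columns x n_columns) matrix of pairwise cosine similarities.
pairwise = cosine_similarity(df.T)

# Wrap the result in a DataFrame so entries can be looked up by column name.
sim = pd.DataFrame(pairwise, index=df.columns, columns=df.columns)
print(sim)
```

The entry sim.loc["col1", "col2"] is the similarity asked about in the question, and the diagonal is 1 because each column is compared with itself.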


Source: https://stackoverflow.com/questions/25736861/python-pandas-finding-cosine-similarity-of-two-columns
