Question
Suppose I have two columns in a Python pandas.DataFrame:
        col1  col2
item_1   158   173
item_2    25   191
item_3   180    33
item_4   152   165
item_5    96   108
What's the best way to take the cosine similarity of these two columns?
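Answer 1:
Is this what you're looking for? Note that scipy's cosine returns the cosine distance, so subtract it from 1 to get the similarity.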
from scipy.spatial.distance import cosine
from pandas import DataFrame
df = DataFrame({"col1": [158, 25, 180, 152, 96],
                "col2": [173, 191, 33, 165, 108]})
print(1 - cosine(df["col1"], df["col2"]))
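As a cross-check, the same number can be computed straight from the definition, the dot product divided by the product of the two vector norms. This is only a minimal sketch using NumPy; the helper name cosine_sim is hypothetical and not part of pandas, NumPy, or SciPy.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"col1": [158, 25, 180, 152, 96],
                   "col2": [173, 191, 33, 165, 108]})

def cosine_sim(a, b):
    # Hypothetical helper: cosine similarity = (a . b) / (|a| * |b|)
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

similarity = cosine_sim(df["col1"], df["col2"])
print(round(similarity, 4))  # 0.7498
```

The helper does not guard against all-zero columns, where the norm is 0 and the similarity is undefined.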
Answer 2:
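You can also use cosine_similarity, or other similarity metrics, from sklearn.metrics.pairwise. It expects 2-D inputs of shape (n_samples, n_features), and recent scikit-learn versions reject 1-D input, so each column is reshaped into a single row:
from sklearn.metrics.pairwise import cosine_similarity
cosine_similarity(df["col1"].to_numpy().reshape(1, -1),
                  df["col2"].to_numpy().reshape(1, -1))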
Out[4]: array([[0.7498213]])
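If the frame has more than two columns, it may be handier to get every pairwise similarity at once. The sketch below assumes the columns are the vectors to compare, so the frame is transposed before the call; the variable name sim is just illustrative.

```python
import pandas as pd
from sklearn.metrics.pairwise import cosine_similarity

df = pd.DataFrame({"col1": [158, 25, 180, 152, 96],
                   "col2": [173, 191, 33, 165, 108]})

# cosine_similarity compares rows, so transpose to make each column a row.
sim = pd.DataFrame(cosine_similarity(df.T.to_numpy()),
                   index=df.columns, columns=df.columns)

print(sim.round(4))
```

The diagonal is 1 (each column against itself), and sim.loc["col1", "col2"] holds the value computed in the answers above.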
Source: https://stackoverflow.com/questions/25736861/python-pandas-finding-cosine-similarity-of-two-columns