Cosine Similarity and LDA topics

余生长醉 提交于 2019-12-10 11:06:39

问题


I want to compute Cosine Similarity between LDA topics. In fact, gensim function .matutils.cossim can do it but I dont know which parameter (vector ) I can use for this function?

Here is a snap of code :

import numpy as np
import lda
from sklearn.feature_extraction.text import CountVectorizer

cvectorizer = CountVectorizer(min_df=4, max_features=10000, stop_words='english')
cvz = cvectorizer.fit_transform(tweet_texts_processed)

n_topics = 8
n_iter = 500
lda_model = lda.LDA(n_topics=n_topics, n_iter=n_iter)
X_topics = lda_model.fit_transform(cvz)

n_top_words = 6
topic_summaries = []

topic_word = lda_model.topic_word_  # get the topic words
vocab = cvectorizer.get_feature_names()
for i, topic_dist in enumerate(topic_word):
    topic_words = np.array(vocab)[np.argsort(topic_dist)][:-(n_top_words+1):-1]
    topic_summaries.append(' '.join(topic_words))
    print('Topic {}: {}'.format(i, ' '.join(topic_words)))

doc_topic = lda_model.doc_topic_
lda_keys = []
for i, tweet in enumerate(tweets):
    lda_keys += [X_topics[i].argmax()]

import gensim
from gensim import corpora, models, similarities
#Cosine Similarity between LDA topics
 **sim = gensim.matutils.cossim(LDA_topic[1], LDA_topic[2])** 

来源:https://stackoverflow.com/questions/48115965/cosine-similarity-and-lda-topics

标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!