Hierarchical Dirichlet Process Gensim topic number independent of corpus size
Question

I am using the Gensim HDP module on a set of documents.

>>> hdp = models.HdpModel(corpusB, id2word=dictionaryB)
>>> topics = hdp.print_topics(topics=-1, topn=20)
>>> len(topics)
150
>>> hdp = models.HdpModel(corpusA, id2word=dictionaryA)
>>> topics = hdp.print_topics(topics=-1, topn=20)
>>> len(topics)
150
>>> len(corpusA)
1113
>>> len(corpusB)
17

Why is the number of topics independent of corpus length?

Answer 1:

@user3907335 is exactly correct here: HDP will calculate as many topics as the