Hierarchical Dirichlet Process Gensim topic number independent of corpus size

余生分开走 2021-02-04 07:20

I am using the Gensim HDP module on a set of documents.

>>> hdp = models.HdpModel(corpusB, id2word=dictionaryB)
>>> topics = hdp.print_topics(         


        
7 Answers
  • 2021-02-04 08:12

    I think you misunderstood what print_topics does. The documentation says:

    Alias for show_topics() that prints the top n most probable words for topics number of topics to log. Set topics=-1 to print all topics.

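    You trained the model without specifying the truncation level on the number of topics (the T parameter of HdpModel), and the default is 150. Calling print_topics with topics=-1 therefore returns the top 20 words for every topic, which in your case means all 150 topics. The count is fixed by the truncation level, not by the size of the corpus.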

    I'm still new to the library, so I may be wrong.
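
To make the truncation point concrete, here is a toy sketch of truncated stick-breaking using only the Python standard library. It is not Gensim's inference code: the function name `truncated_stick_breaking` and the 1% threshold are assumptions made up for illustration, and the weights are drawn from a prior rather than fitted to a corpus. It only shows that a model truncated at T always carries T topic slots, while few of them hold much weight.

```python
import random

def truncated_stick_breaking(gamma, T, rng):
    """Draw T topic weights from a stick-breaking prior truncated at T (illustrative helper)."""
    weights = []
    remaining = 1.0  # length of the stick not yet assigned to a topic
    for _ in range(T - 1):
        frac = rng.betavariate(1.0, gamma)  # fraction broken off the remaining stick
        weights.append(remaining * frac)
        remaining *= 1.0 - frac
    weights.append(remaining)  # the last slot absorbs the leftover so weights sum to 1
    return weights

rng = random.Random(0)
weights = truncated_stick_breaking(gamma=1.0, T=150, rng=rng)

print(len(weights))  # 150: fixed by T, independent of any corpus
significant = [w for w in weights if w > 0.01]  # 1% cutoff is an arbitrary choice
print(len(significant))  # slots that carry more than 1% of the total weight
```

In practice this suggests treating the 150 returned topics as an upper bound and looking at which of them actually carry weight, rather than reading 150 as the inferred number of topics.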
