Hierarchical Dirichlet Process Gensim topic number independent of corpus size


余生分开走 2021-02-04 07:20

I am using the Gensim HDP module on a set of documents.

>>> hdp = models.HdpModel(corpusB, id2word=dictionaryB)
>>> topics = hdp.print_topics(


      
      
        
7 answers

别那么骄傲 (original poster) 2021-02-04 08:12
              

            
            
                        
I think you misunderstood what the method you called actually does. The documentation says:


  Alias for show_topics() that prints the top n most probable words for topics number of topics to log. Set topics=-1 to print all topics. 


You trained the model without specifying the truncation level on the number of topics, and the default is 150. Calling print_topics with topics=-1 returns the top 20 words for every topic, which in your case means 150 topics.

I'm still new to the library, so I may be wrong.
    
             
                                                        
            
            
              
                