Inefficiency of topic modelling for text clustering


春和景丽 2021-01-25 23:38

I tried text clustering with LDA, but it isn't giving me distinct clusters. Below is my code:

#Import libraries
from gensim import corpora, models
import


      
      
        
1 Answer

后悔当初 (original poster) 2021-01-25 23:48
              

            
            
                        
That is just realistic.

Neither documents nor words are usually uniquely assignable to a single cluster.

If you manually label some data, you will also quickly find documents that cannot be clearly labeled as one or the other. So it's good if the algorithm doesn't pretend there is a good unique assignment.
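To make the point concrete, here is a minimal sketch of turning per-document topic probabilities into a hard cluster label only when one topic clearly dominates. It assumes the probabilities come as a list of `(topic_id, probability)` pairs, which is the shape gensim's `LdaModel.get_document_topics` returns. The function name `assign_cluster`, the `min_confidence` cutoff of 0.6, and the sample distributions are all hypothetical and chosen purely for illustration; none of them come from the question's code.

```python
def assign_cluster(topic_probs, min_confidence=0.6):
    """Return the dominant topic id, or None if no topic clearly dominates.

    topic_probs: list of (topic_id, probability) pairs for one document.
    min_confidence: hypothetical cutoff; tune it for your own data.
    """
    if not topic_probs:
        return None
    # Pick the pair with the highest probability.
    topic_id, prob = max(topic_probs, key=lambda pair: pair[1])
    return topic_id if prob >= min_confidence else None


# Made-up distributions for illustration only (not output from a real model).
clear_doc = [(0, 0.85), (1, 0.10), (2, 0.05)]
mixed_doc = [(0, 0.40), (1, 0.35), (2, 0.25)]

print(assign_cluster(clear_doc))  # 0
print(assign_cluster(mixed_doc))  # None
```

Documents that come back as `None` are the ones a human labeler would probably hesitate over too, so it may be more useful to keep their full topic mixture than to force them into one cluster.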
    
             
                                                        
            
            
              
                