kmeans: Quick-TRANSfer stage steps exceeded maximum

前端未结


I am running k-means clustering in R on a dataset with 636,688 rows and 7 columns using the standard stats package: kmeans(dataset, centers = 100, nstart = 25


                      
4 Answers

名媛妹妹
2021-02-01 15:28

I had the same problem; it seems to be related to available memory.

Running garbage collection before calling kmeans worked for me:

gc()

Alternatively, see: Increasing (or decreasing) the memory available to R processes

无人共我
2021-02-01 15:28

I got the same message, but in my case increasing the maximum number of iterations (iter.max) made it go away, which argues against the memory explanation.
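
A minimal sketch of raising the iteration cap. The synthetic matrix `x` stands in for the real dataset, and the values `centers = 10`, `nstart = 5`, and `iter.max = 100` are illustrative assumptions rather than recommendations:

```r
set.seed(42)                               # reproducible synthetic data
x <- matrix(rnorm(2000 * 7), ncol = 7)     # assumed stand-in for the real dataset

# The default is iter.max = 10; raise it explicitly.
fit <- kmeans(x, centers = 10, nstart = 5, iter.max = 100)

fit$iter     # iterations used by the returned fit
fit$ifault   # 4 would indicate the Quick-Transfer problem
```

Inspecting `fit$iter` afterwards shows how close the run came to the cap.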
                                                                        
                                                        
            
            
              
                
感动是毒
2021-02-01 15:29

As @jlhoward suggested in a comment, try:

kmeans(dataset, algorithm="Lloyd", ..)
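
A small sketch of that suggestion, using a synthetic matrix `x` as an assumed stand-in for the dataset. The Quick-Transfer stage belongs to the default Hartigan-Wong algorithm, so switching to Lloyd avoids that particular warning; the parameter values below are illustrative only:

```r
set.seed(1)
x <- matrix(rnorm(2000 * 7), ncol = 7)     # assumed stand-in for the real dataset

# Lloyd's algorithm instead of the default Hartigan-Wong.
fit_lloyd <- kmeans(x, centers = 10, nstart = 5, iter.max = 200,
                    algorithm = "Lloyd")

# Default algorithm, for comparison.
fit_hw <- kmeans(x, centers = 10, nstart = 5, iter.max = 200)

# Compare the objective (total within-cluster sum of squares); lower is better.
c(Lloyd = fit_lloyd$tot.withinss, HartiganWong = fit_hw$tot.withinss)
```

Comparing `tot.withinss` gives a rough idea of whether the switch costs any clustering quality on a given dataset.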

                                                                        
                                                        
            
            
              
                
攒了一身酷
2021-02-01 15:35

I just had the same issue.

See the documentation of kmeans in R via ?kmeans:


    The Hartigan-Wong algorithm generally does a better job than either
    of those, but trying several random starts (‘nstart’> 1) is often
    recommended.  In rare cases, when some of the points (rows of ‘x’)
    are extremely close, the algorithm may not converge in the
    “Quick-Transfer” stage, signalling a warning (and returning
    ‘ifault = 4’).  Slight rounding of the data may be advisable in
    that case.


In these cases, you may need to switch to the Lloyd or MacQueen algorithms.
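
The documentation's other suggestion, slight rounding of the data, might look like the sketch below. The synthetic matrix `x` and the choice of `digits = 3` are assumptions for illustration; a suitable precision depends on the scale of the real data.

```r
set.seed(7)
x <- matrix(rnorm(2000 * 7), ncol = 7)     # assumed stand-in for the real dataset

# Round slightly before clustering; digits = 3 is an illustrative choice.
x_rounded <- round(x, digits = 3)

fit <- kmeans(x_rounded, centers = 10, nstart = 5, iter.max = 100)
fit$ifault   # 4 would indicate the Quick-Transfer problem
```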

The nasty thing about R here is that it continues with a warning that may go unnoticed. For my benchmark purposes, I consider this to be a failed run, and thus I use:

if (kms$ifault == 4) stop("Failed in Quick-Transfer")


Depending on your use case, you may want to do something like

if (kms$ifault == 4) kms <- kmeans(X, kms$centers, algorithm = "MacQueen")


instead, to continue with a different algorithm.
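
The fallback idea can be wrapped in a small function. This is only a sketch: `kmeans_with_fallback` is a hypothetical helper, not part of the stats package, and the synthetic data and `iter.max = 100` are assumptions.

```r
# Hypothetical helper (not part of stats): run the default Hartigan-Wong
# algorithm and, if it reports ifault == 4, continue with MacQueen from the
# centers reached so far.
kmeans_with_fallback <- function(x, centers, iter.max = 100) {
  fit <- kmeans(x, centers, iter.max = iter.max)
  if (isTRUE(fit$ifault == 4)) {
    fit <- kmeans(x, fit$centers, iter.max = iter.max, algorithm = "MacQueen")
  }
  fit
}

set.seed(11)
x <- matrix(rnorm(1000 * 7), ncol = 7)     # assumed stand-in for the real dataset
fit <- kmeans_with_fallback(x, centers = 5)
table(fit$cluster)                         # cluster sizes
```

Note that the original Quick-Transfer warning is still emitted by the first call; the helper only makes sure the returned fit comes from a run that did not end with `ifault == 4`.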

If you are benchmarking k-means, note that R uses iter.max = 10 by default. Convergence may take many more than 10 iterations.
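
For benchmarks, one way to keep such warnings from going unnoticed is to collect them while the call runs. The sketch below uses a deliberately small `iter.max` on synthetic data (both are assumptions for illustration) and records any warning messages instead of letting them scroll past:

```r
set.seed(3)
x <- matrix(rnorm(5000 * 7), ncol = 7)     # assumed stand-in for the real dataset

msgs <- character(0)
fit <- withCallingHandlers(
  kmeans(x, centers = 20, iter.max = 2),   # deliberately small cap
  warning = function(w) {
    msgs <<- c(msgs, conditionMessage(w))  # record the message
    invokeRestart("muffleWarning")         # and let kmeans continue
  }
)

msgs                                       # warnings raised during the run, if any
clean_run <- length(msgs) == 0             # TRUE only if nothing was flagged
```

A benchmark could then discard or rerun any fit for which `clean_run` is `FALSE`.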
                                                                        
                                                        
            
            
              
                