Why is using dplyr pipe (%>%) slower than an equivalent non-pipe expression, for high-cardinality group-by?

Asked by 孤城傲影 on 2020-12-24 14:20

I thought that, generally speaking, using %>% wouldn't have a noticeable effect on speed. But in this case it runs about 4x slower.

library(dplyr)
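
The snippet above is truncated. Below is a minimal sketch of the kind of pipe vs. non-pipe group-by comparison the question describes; the data frame d and the columns id and val are hypothetical stand-ins, not the original code:

    library(dplyr)
    library(microbenchmark)

    set.seed(0)
    # High-cardinality grouping: ids drawn from 1:10000 over 10000 rows,
    # so most groups contain only a handful of values.
    d <- tibble(id = sample(1e4, 1e4, replace = TRUE), val = runif(1e4))

    microbenchmark(
      pipe    = d %>% group_by(id) %>% summarise(m = mean(val)),
      no_pipe = summarise(group_by(d, id), m = mean(val))
    )
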
4 Answers
  •  生来不讨喜
    2020-12-24 14:53

    Here is something I learnt today. I am using R 3.5.0.

    Code with x = 100 (1e2)

    library(microbenchmark)
    library(dplyr)
    
    set.seed(99)
    x <- 1e2
    z <- sample(x, x / 2, TRUE)  # x/2 draws with replacement from 1:x
    timings <- microbenchmark(
      dp = z %>% unique %>% list,  # piped version
      bs = list(unique(z)))        # plain nested-call version
    
    print(timings)
    
    Unit: microseconds
     expr    min      lq      mean   median       uq     max neval
       dp 99.055 101.025 112.84144 102.7890 109.2165 312.359   100
       bs  6.590   7.653   9.94989   8.1625   8.9850  63.790   100
    

    However, with x = 1e6 (the same code, just a larger input), the two versions are essentially indistinguishable:

    Unit: milliseconds
     expr      min       lq     mean   median       uq      max neval
       dp 27.77045 31.78353 35.09774 33.89216 38.26898  52.8760   100
       bs 27.85490 31.70471 36.55641 34.75976 39.12192 138.7977   100
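
    So the pipe appears to add a small, roughly constant cost per call (about 100 µs in the first table): it dominates when the piped work is trivial, and vanishes once the real work is large. One way to see that fixed cost in isolation is to pipe a no-op; this is a sketch, not part of the original answer:

    library(microbenchmark)
    library(magrittr)  # provides %>% (dplyr re-exports the same operator)

    # identity() is essentially free, so any timing difference
    # between these two expressions is pure pipe overhead.
    microbenchmark(
      pipe = 1 %>% identity,
      call = identity(1)
    )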
    
