Converting a package from S3 to S4 classes: is there going to be a performance drop?


I have an R package which currently uses the S3 class system, with two different classes and several methods for generic S3 functions like plot, logL


                      


      
      
        
3 Answers

        
                         				            
            
           
            
                              
                
              
              
                
萌比男神i, 2021-02-14 02:33
              
            
            
                                                                       
(This is pretty close to the boundary of a "question likely to elicit opinion", but I believe it is an important issue, one for which you have offered code, data and useful citations, so I hope there are no votes to close.)

I admit that I have never really understood the S4 model of programming. However, what Chambers' post was saying is that @<-, i.e. slot assignment, was being re-implemented as a primitive rather than as a closure, so that it would no longer require a complete copy of an object when one component is altered. So the earlier state of affairs changes in R 3.0.0 beta. On my machine (a 5-year-old MacPro running R 3.0.0 beta) the relative difference was even greater. However, I did not think that was necessarily a good test, since it was not altering an existing copy of a named object with multiple slots.

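library(microbenchmark)
# Assumed class definition (not shown in this answer): one numeric slot x
setClass("MyClass", representation(x = "numeric"))
res <- microbenchmark(structure(list(x = rep(1, 10^7)), class = "MyS3Class"),
                      new("MyClass", x = rep(1, 10^7)))
summary(res)[, "median"]
#[1] 145.0541 103.4064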
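
A test closer to the case described above would replace one component of an existing, named object that has several slots. The following is only a sketch under assumptions: the class MultiSlot, its S3 counterpart and the slot names are hypothetical, and base R's system.time is used instead of microbenchmark so that nothing needs to be installed. Timings depend heavily on the R version, so no expected numbers are given.

```r
library(methods)

# Hypothetical class with several slots (not from the question)
setClass("MultiSlot",
         representation(x = "numeric", y = "numeric", label = "character"))

n <- 1e6
objS4 <- new("MultiSlot", x = rep(1, n), y = rep(2, n), label = "demo")
objS3 <- structure(list(x = rep(1, n), y = rep(2, n), label = "demo"),
                   class = "MultiSlotS3")

# Replace one small component of an existing, named object many times.
# If every assignment copied the whole object, the two large vectors
# would dominate the cost.
time_s4 <- system.time(for (i in 1:100) objS4@label <- "changed")
time_s3 <- system.time(for (i in 1:100) objS3$label <- "changed")

# Elapsed seconds for each variant
print(c(S4 = time_s4[["elapsed"]], S3 = time_s3[["elapsed"]]))
```

Running this on both an old and a current R version is probably the most direct way to see whether slot assignment still copies in your setup.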


I think you should go with S4, since your brain structure is more flexible than mine and there are a lot of very smart people (Douglas Bates and Martin Maechler, to name two besides John Chambers) who have used S4 methods for packages that require heavy processing. The Matrix and lme4 packages both use S4 methods for critical functions.
                                                                        
                                                        
            
            
              
                
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
我在风中等你, 2021-02-14 02:46
              
            
            
                                                                       

First of all, you can easily have S3 methods for S4 classes:

> extract <- function (x, ...) x@x
> setGeneric ("extr4", def=function (x, ...){})
[1] "extr4"
> setMethod ("extr4", signature= "MyClass", definition=extract)
[1] "extr4"
> `[.MyClass` <- extract
> `[.MyS3Class` <- function (x, ...) x$x
> microbenchmark (objS3[], objS4 [], extr4 (objS4), extract (objS4))
Unit: nanoseconds
           expr   min      lq  median      uq   max neval
        objS3[]  6775  7264.5  7578.5  8312.0 39531   100
        objS4[]  5797  6705.5  7124.0  7404.0 13550   100
   extr4(objS4) 20534 21512.0 22106.0 22664.5 54268   100
 extract(objS4)   908  1188.0  1328.0  1467.0 11804   100
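
The transcript above uses objS3 and objS4 without showing how they were created. A self-contained sketch of the same idea could look like this. The class definitions and the S3 generic peek are assumptions made for illustration, and the S4 generic is written with an explicit standardGeneric call. It only checks that all four call paths return the same data and leaves the timing to microbenchmark.

```r
library(methods)

# Assumed definitions; the transcript does not show how the objects were built
setClass("MyClass", representation(x = "numeric"))
objS4 <- new("MyClass", x = c(1, 2, 3))
objS3 <- structure(list(x = c(1, 2, 3)), class = "MyS3Class")

# Plain function, called without any dispatch
extract <- function(x, ...) x@x

# S4 generic with an S4 method for the S4 class
setGeneric("extr4", function(x, ...) standardGeneric("extr4"))
setMethod("extr4", signature = "MyClass", definition = extract)

# Hypothetical S3 generic with one method per class.
# UseMethod dispatches on class(x), which is "MyClass" for the S4 object,
# so an S3 method can serve an S4 class.
peek <- function(x, ...) UseMethod("peek")
peek.MyClass <- extract
peek.MyS3Class <- function(x, ...) x$x

results <- list(peek(objS3), peek(objS4), extr4(objS4), extract(objS4))
print(results)
```

With these definitions in place, each of the four calls can be passed to microbenchmark to compare S3 dispatch, S4 dispatch and a direct call.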





Edit: following Hadley's comment, I changed the experiment to use plot:

> `plot.MyClass` <- extract
> `plot.MyS3Class` <- function (x, ...) x$x
> microbenchmark (plot (objS3), plot (objS4), extr4 (objS4), extract (objS4))
Unit: nanoseconds
           expr   min      lq median      uq     max neval
    plot(objS3) 28915 30172.0  30591 30975.5 1887824   100
    plot(objS4) 25353 26121.0  26471 26960.0  411508   100
   extr4(objS4) 20395 21372.5  22001 22385.5   31359   100
 extract(objS4)   979  1328.0   1398  1677.0    3982   100


With an S4 method for plot I get:

    plot(objS4) 19835 20428.5 21336.5 22175.0 58876   100


So yes, [ has an exceptionally fast dispatch mechanism (which is good, because I think extraction and the corresponding replacement functions are among the most frequently called methods). But no, S4 dispatch isn't slower than S3 dispatch.



Here the S3 method on the S4 object is as fast as the S3 method on the S3 object. However, calling without dispatch is still faster.


There are some things that work much better as S3, such as as.matrix or as.data.frame.

For some reason, defining these as S3 means that e.g. lm (formula, objS4) will work out of the box. This doesn't work when as.data.frame is defined as an S4 method.
Also, it is much more convenient to call debug on an S3 method.
Some other things will not work with S3, e.g. dispatching on the second argument.
Whether there will be any noticeable drop in performance obviously depends on your class, that is, what kind of structures you have, how large the objects are and how often methods are called. A few μs of method dispatch won't matter in a calculation that takes ms or even s. But μs do matter when a function is called billions of times.
One thing that caused a noticeable performance drop for some functions that are called often ([) is S4 validation (a fair number of checks done in validObject). However, I'm glad to have it, so I use it. Internally I use workhorse functions that skip this step.
In case you have large data and call-by-reference would help your performance, you may want to have a look at reference classes. I've never really worked with them so far, so I cannot comment on this.
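
To illustrate the point about validation and workhorse functions, here is a minimal sketch. The class Checked, the constructor checked and the helper .set_x_unchecked are hypothetical names, not taken from any package. It relies on the fact that new() runs the validity method when slots are supplied, while slot assignment with @<- only checks the slot's class.

```r
library(methods)

# Hypothetical class whose validity method inspects the data
setClass("Checked",
         representation(x = "numeric"),
         validity = function(object) {
           if (anyNA(object@x)) "x must not contain NA" else TRUE
         })

# User-facing constructor: new() calls validObject() here
checked <- function(x) new("Checked", x = x)

# Internal workhorse: assumes the caller already guarantees a valid x.
# Slot assignment checks that x is numeric but does not run validObject().
.set_x_unchecked <- function(object, x) {
  object@x <- x
  object
}

obj <- checked(c(1, 2, 3))
obj <- .set_x_unchecked(obj, c(4, 5, 6))

# The public path still rejects invalid input
bad <- tryCatch(checked(c(1, NA)), error = function(e) conditionMessage(e))
print(bad)
```

The trade-off is that an unchecked helper can produce an invalid object if it is called with bad input, so it is best kept internal to the package.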

                                                                        
                                                        
            
            
              
                
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
耶瑟儿～, 2021-02-14 02:49
              
            
            
                                                                       
If you are concerned about performance, benchmark it. If you really need multiple inheritance or multiple dispatch, use S4. Otherwise use S3.
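
As a rough illustration of what multiple dispatch means here, the sketch below defines an S4 generic whose method is chosen by the classes of both arguments, which S3's UseMethod cannot do directly. The classes Metres and Feet and the generic add_lengths are made up for this example.

```r
library(methods)

# Hypothetical classes, each holding a single numeric value
setClass("Metres", representation(value = "numeric"))
setClass("Feet", representation(value = "numeric"))

# Generic that dispatches on both arguments
setGeneric("add_lengths", function(a, b) standardGeneric("add_lengths"))

# Method chosen when both arguments are Metres
setMethod("add_lengths", signature("Metres", "Metres"),
          function(a, b) new("Metres", value = a@value + b@value))

# Method chosen when the second argument is Feet: convert, then add
setMethod("add_lengths", signature("Metres", "Feet"),
          function(a, b) new("Metres", value = a@value + b@value * 0.3048))

m <- new("Metres", value = 10)
f <- new("Feet", value = 10)

same_unit  <- add_lengths(m, m)
mixed_unit <- add_lengths(m, f)
print(c(same_unit@value, mixed_unit@value))
```

If none of your generics need to look at more than the first argument, this capability is not a reason to switch.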
                                                                        
                                                        
            
            
              
                