R: Pass data.frame by reference to a function

前端 未结 1 1044
说谎
说谎 2021-01-18 04:27

I pass a data.frame as parameter to a function that want to alter the data inside:

x <- data.frame(value=c(1,2,3,4))
f <- function(d){
           


        
相关标签:
1条回答
  • 2021-01-18 05:03

    Actually in R (almost) each modification is performed on a copy of the previous data (copy-on-writing behavior).
    So for example inside your function, when you do d$value[i] <-0 actually some copies are created. You usually won't notice that since it's well optimized, but you can trace it by using tracemem function.

    That being said, if your data.frame is not really big you can stick with your function returning the modified object, since it's just one more copy afterall.

    But, if your dataset is really big and doing a copy everytime can be really expensive, you can use data.table, that allows in-place modifications, e.g. :

    library(data.table)
    d <- data.table(value=c(1,2,3,4))
    f <- function(d){
      for(i in 1:nrow(d)) {
        if(d$value[i] %% 2 == 0){
          set(d,i,1L,0) # special function of data.table (see also ?`:=` )
        }
      }
      print(d)
    }
    
    f(d)
    print(d)
    
    # results :
    > f(d)
       value
    1:     1
    2:     0
    3:     3
    4:     0
    > 
    > print(d)
       value
    1:     1
    2:     0
    3:     3
    4:     0
    

    N.B.

    In this specific case, the loop can be replaced with a "vectorized" and more efficient version e.g. :

    d[d$value %% 2 == 0,'value'] <- 0
    

    but maybe your real loop code is much more convoluted and cannot be vectorized easily.

    0 讨论(0)
提交回复
热议问题