What is the fastest way to update a data set in R?

谎友^ 2021-01-07 03:08

I have a 20000 * 5 data set. It is currently processed iteratively, and the data set is updated on every iteration.

The cells in the

2 Answers
  •  执念已碎
    2021-01-07 03:40

    Interestingly, a data.table does not seem to be faster at first glance, at least for a single-cell update like this one. It may become faster when the assignment is done inside a loop.

    library(data.table)
    library(microbenchmark)

    # `test` is the data.frame from the question; convert it to a data.table
    dt <- data.table(test)

    # Accessing the entry (row 765, column C)
    dt[765, "C", with = FALSE]

    # Replacing the value with the new one
    # Basic data.table syntax: update by reference with :=
    dt[i = 765, C := C + 25]

    # Replacing the value with the new one
    # using set() from data.table
    set(dt, i = 765L, j = "C", value = dt[765L, C] + 25)

    # Time the two data.table variants against plain data.frame assignment
    microbenchmark(
      a = set(dt, i = 765L, j = "C", value = dt[765L, C] + 25),
      b = dt[i = 765, C := C + 25],
      c = test[765, "C"] <- test[765, "C"] + 25,
      times = 1000
    )
    

    The results from microbenchmark:

    expr                                                         min      lq     mean  median       uq      max neval
    a = set(dt, i = 765L, j = "C", value = dt[765L, C] + 25) 236.357  46.621 266.4188 250.847 260.2050  572.630  1000
    b = dt[i = 765, `:=`(C, C + 25)]                         333.556 345.329 375.8690 351.668 362.6860 1603.482  1000
    c = test[765, "C"] <- test[765, "C"] + 25                 73.051  81.805 129.1665  84.220  87.6915 1749.281  1000
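
The benchmark above only times a single assignment and leaves the loop case as a guess. A minimal sketch of how one might check that guess follows. The data frame `test` here is a hypothetical stand-in (random values, assumed column names `A` to `E`), since the original data set is not shown; `rows`, `before`, and the count of 1000 updates are likewise only illustrative.

```r
library(data.table)

# Hypothetical stand-in for the 20000 x 5 data set; column names are assumed.
set.seed(1)
test <- as.data.frame(matrix(runif(20000 * 5), ncol = 5,
                             dimnames = list(NULL, c("A", "B", "C", "D", "E"))))
dt <- as.data.table(test)   # independent copy as a data.table
before <- test[["C"]]       # keep the original column for a sanity check

# One distinct row per iteration (illustrative choice)
rows <- sample(nrow(test), 1000)

# Update by reference inside a loop with set()
time_set <- system.time(
  for (r in rows) {
    set(dt, i = r, j = "C", value = dt[["C"]][r] + 25)
  }
)

# Same updates on the plain data.frame
time_df <- system.time(
  for (r in rows) {
    test[r, "C"] <- test[r, "C"] + 25
  }
)

# Both approaches should have changed the same 1000 cells in the same way
stopifnot(isTRUE(all.equal(dt[["C"]], test[["C"]])))
stopifnot(sum(dt[["C"]] != before) == length(rows))

print(time_set)
print(time_df)
```

No timings are quoted here because they depend on the machine and the data. The data.table documentation describes `set()` as a low-overhead way to assign by reference inside loops, so it is the natural candidate to compare against the data.frame assignment.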
    
