Challenge: recoding a data.frame() — make it faster

借酒劲吻你 2021-02-04 06:43

Recoding is a common practice for survey data, but the most obvious routes take more time than they should.

The fastest code that accomplishes the same task with the pr

6 answers
  •  -上瘾入骨i
    2021-02-04 06:56

    My computer is obviously much slower, but structure() is a pretty fast way to do this, since it only attaches the factor class and levels to the existing integer codes:

    > system.time({
    + dat1 <- dat
    + for(x in 1:ncol(dat)) {
    +   dat1[,x] <- factor(dat1[,x], labels=re.codes)
    +   }
    + })
       user  system elapsed 
     11.965   3.172  15.164 
    > 
    > system.time({
    + m <- as.matrix(dat)
    + dat2 <- data.frame( matrix( re.codes[m], nrow = nrow(m)))
    + })
       user  system elapsed 
      2.100   0.516   2.621 
    > 
    > system.time(dat3 <- data.frame(lapply(dat, structure, class='factor', levels=re.codes)))
       user  system elapsed 
      0.484   0.332   0.820 
    
    # dat1 and dat2 are not equal, because the levels get re-ordered in dat2
    > all.equal(dat1, dat2)
    
    > all.equal(dat1, dat3)
    [1] TRUE
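
    The structure() approach can be tried on a small example. This is only a sketch: the toy dat and the re.codes labels below are made up for illustration (the real ones come from the question), and it assumes every column holds integer codes from 1 to length(re.codes). If the columns are stored as doubles, or contain codes outside that range, the result is not a valid factor.

```r
# Hypothetical toy data: integer codes 1..5 in every column
set.seed(1)
re.codes <- c("This", "That", "And", "The", "Other")  # assumed labels
dat <- data.frame(matrix(sample(1:5, 20, replace = TRUE), nrow = 5))

# A factor is an integer vector with 'levels' and 'class' attributes,
# so attaching them directly avoids the matching work done by factor().
recoded <- data.frame(lapply(dat, structure, class = "factor", levels = re.codes))

# Reference result from the slower factor() route; levels = 1:5 is given
# explicitly so that codes missing from a column still get a label.
slow <- dat
for (x in seq_len(ncol(dat))) {
  slow[, x] <- factor(slow[, x], levels = 1:5, labels = re.codes)
}

# Compare the two results
print(all.equal(slow, recoded))
```

    Unlike the matrix route, this never turns the codes into character strings, so the levels stay in the order given in re.codes.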
    
