Challenge: recoding a data.frame() — make it faster

后端未结

关注

 6  1689

借酒劲吻你 2021-02-04 06:43

Recoding is a common practice for survey data, but the most obvious routes take more time than they should.

The fastest code that accomplishes the same task with the pr

6条回答

醉梦人生 (楼主)

2021-02-04 06:50

Combining @DWin's answer, and my answer from Most efficient list to data.frame method?:

system.time({
  dat3 <- list()
  # define attributes once outside of loop
  attrib <- list(class="factor", levels=re.codes)
  for (i in names(dat)) {              # loop over each column in 'dat'
    dat3[[i]] <- as.integer(dat[[i]])  # convert column to integer
    attributes(dat3[[i]]) <- attrib    # assign factor attributes
  }
  # convert 'dat3' into a data.frame. We can do it like this because:
  # 1) we know 'dat' and 'dat3' have the same number of rows and columns
  # 2) we want 'dat3' to have the same colnames as 'dat'
  # 3) we don't care if 'dat3' has different rownames than 'dat'
  attributes(dat3) <- list(row.names=c(NA_integer_,nrow(dat)),
    class="data.frame", names=names(dat))
})
identical(dat2, dat3)  # 'dat2' is from @Dwin's answer

0 讨论(0)

查看其它6个回答