Challenge: recoding a data.frame() — make it faster

借酒劲吻你 2021-02-04 06:43

Recoding is a common practice for survey data, but the most obvious routes take more time than they should.

The fastest code that accomplishes the same task with the pr

6 answers
  •  一整个雨季
    2021-02-04 06:52

    The help page for class() says that class<- is deprecated and recommends the as. methods instead. I haven't quite figured out why the earlier effort reported 0 observations when the data was obviously in the object, but this method produces a complete object:

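        system.time({
          # Pre-allocate a list with one slot per column of dat
          dat2 <- vector(mode = "list", length = length(dat))
          for (i in seq_along(dat)) {
            dat2[[i]] <- dat[[i]]
            storage.mode(dat2[[i]]) <- "integer"
            # Setting class and levels directly turns the integer codes into a factor
            attributes(dat2[[i]]) <- list(class = "factor", levels = re.codes)
          }
          names(dat2) <- names(dat)
          dat2 <- as.data.frame(dat2)
        })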
    #--------------------------  
      user  system elapsed 
      0.266   0.290   0.560 
    > str(dat2)
    'data.frame':   250000 obs. of  36 variables:
     $ V1 : Factor w/ 5 levels "This","That",..: 1 2 3 4 5 1 2 3 4 5 ...
     $ V2 : Factor w/ 5 levels "This","That",..: 5 4 3 2 1 5 4 3 2 1 ...
     $ V3 : Factor w/ 5 levels "This","That",..: 1 2 4 5 3 1 2 4 5 3 ...
     $ V4 : Factor w/ 5 levels "This","That",..: 1 2 3 4 5 1 2 3 4 5 ...
     $ V5 : Factor w/ 5 levels "This","That",..: 5 4 3 2 1 5 4 3 2 1 ...
     $ V6 : Factor w/ 5 levels "This","That",..: 1 2 4 5 3 1 2 4 5 3 ...
     $ V7 : Factor w/ 5 levels "This","That",..: 1 2 3 4 5 1 2 3 4 5 ...
     $ V8 : Factor w/ 5 levels "This","That",..: 5 4 3 2 1 5 4 3 2 1 ...
     snipped
    

    All 36 columns are there.
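
To try the attribute-based approach from the answer above on something small, here is a minimal sketch. The toy data frame, the helper name recode_fast, and the last three labels in re.codes are assumptions made for illustration (the post's output only shows "This" and "That"). The factor() baseline is included only as a comparison with the obvious route.

```r
# Hypothetical toy data: integer survey codes 1..5 in three columns (V1..V3)
set.seed(1)
dat <- as.data.frame(matrix(sample(1:5, 30, replace = TRUE), ncol = 3))

# Assumed labels; only "This" and "That" are visible in the original output
re.codes <- c("This", "That", "And", "The", "Other")

# Baseline for comparison: the obvious route through factor()
baseline <- as.data.frame(lapply(dat, factor, levels = 1:5, labels = re.codes))

# Hypothetical helper wrapping the loop from the answer
recode_fast <- function(df, labels) {
  out <- vector(mode = "list", length = length(df))
  for (i in seq_along(df)) {
    codes <- df[[i]]
    storage.mode(codes) <- "integer"   # factors are stored as integer codes
    # Attach class and levels directly instead of calling factor()
    attributes(codes) <- list(class = "factor", levels = labels)
    out[[i]] <- codes
  }
  names(out) <- names(df)
  as.data.frame(out)
}

dat2 <- recode_fast(dat, re.codes)
str(dat2)

# Compare the labels produced by both routes, column by column
all(mapply(function(a, b) identical(as.character(a), as.character(b)),
           dat2, baseline))
```

One caveat with this design: setting the attributes by hand skips the checks that factor() performs, so it is only safe when every value is already a whole number between 1 and length(labels).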
