For Loop in R (Looking for an alternative)

再見小時候 2021-01-27 18:02

The following code runs a loop, but the problem is speed: it takes several hours to finish, and I am looking for an alternative so that I don't have to wait so long.

1 answer
  • 2021-01-27 18:36

    If you run your code (with a smaller number of rows) through a profiler, you see that the main bottleneck is the rbind at the end, followed by the c mentioned by @Riverarodrigoa.
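A minimal sketch of how such a profile could be collected with base R's sampling profiler (Rprof and summaryRprof from the utils package). The function grow_with_rbind is a hypothetical stand-in for the slow pattern, not the code from the question, and the row count is an arbitrary choice to give the profiler enough samples.

```r
# Hypothetical stand-in for the slow pattern: grow a data.frame row by row.
grow_with_rbind <- function(n) {
  out <- data.frame()
  for (i in seq_len(n)) {
    out <- rbind(out, data.frame(id = i, value = sqrt(i)))
  }
  out
}

prof_file <- tempfile(fileext = ".out")
Rprof(prof_file, interval = 0.01)  # start sampling the call stack
res <- grow_with_rbind(3000)
Rprof(NULL)                        # stop profiling

# Time per function, including the functions it calls
prof <- summaryRprof(prof_file)
print(head(prof$by.total))
```

In a profile like this, the rbind-related calls should account for most of the sampled time, which is the kind of picture described above.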

    We can address both by creating numeric matrices of suitable size up front and working with those. The final data.frame is created only at the end:

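    options(stringsAsFactors = FALSE)
    N <- 1000
    set.seed(42)
    # Example data: one item, N clients, 60 days of demand
    DAT <- data.frame(ITEM = "x", 
                      CLIENT = as.numeric(1:N), 
                      matrix(sample(1:1000, 60, replace = TRUE), ncol = 60, nrow = N, dimnames = list(NULL, paste0('DAY_', 1:60))))
    
    nRow <- nrow(DAT)
    # Preallocate the result matrix once instead of growing it inside the loop
    TMP  <- matrix(0, ncol = 8, nrow = nRow,  
                   dimnames = list(NULL, c("Average", "DesvEst", "Max", "Min", "Prom60", "Prom30", "Prom15", "Prom07")))
    # Numeric matrix of the DAY_ columns; row access is cheaper than on a data.frame
    DemandMat <- as.matrix(DAT[, 3:ncol(DAT)])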
    
    for(iROW in seq_len(nRow)){
      Demand <- DemandMat[iROW, ]
    
      # Positions of the non-missing days in this row
      ww <- which(!is.na(Demand))
    
      if(length(ww) > 0){
        Average <- round(mean(Demand[ww]), digits = 4)
        DesvEst  <- round(sd(Demand, na.rm = TRUE), digits = 4)
        # Band of one standard deviation around the mean, floored at 0
        Max      <- round(Average + (1 * DesvEst), digits = 4)
        Min      <- round(max(Average - (1 * DesvEst), 0), digits = 4)
        # Clamp the demand to [Min, Max]; missing values stay missing
        Demand  <- round(ifelse(is.na(Demand), Demand, ifelse(Demand > Max, Max, ifelse(Demand < Min, Min, Demand))))
        # Means of the clamped demand over all days and over the last 30, 15 and 7 days
        Prom60   <- round(mean(Demand[ww]), digits = 4)
        Prom30   <- round(mean(Demand[intersect(ww, (length(Demand) - 29):length(Demand))]), digits = 4)
        Prom15   <- round(mean(Demand[intersect(ww, (length(Demand) - 14):length(Demand))]), digits = 4)
        Prom07   <- round(mean(Demand[intersect(ww, (length(Demand) - 6):length(Demand))]), digits = 4)
    
      }else{
        Average <- DesvEst <- Max <- Min <- Prom60 <- Prom30 <- Prom15 <- Prom07 <- NA
    
      }
      # Write into the preallocated matrices; nothing grows inside the loop
      DemandMat[iROW, ] <- Demand 
      TMP[iROW, ] <- c(Average, DesvEst, Max, Min, Prom60, Prom30, Prom15, Prom07)
    }
    # Build the final data.frame once, at the end
    DAT <- cbind(DAT[, 1:2], DemandMat, TMP)
    

    For 1,000 rows this takes about 0.2 s instead of over 4 s. For 10,000 rows I get 2 s instead of 120 s.
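To illustrate where a gap of this size can come from, here is a toy comparison (not the code from the question) of growing a matrix with rbind versus filling a preallocated one. The function names grow and prealloc and the size n are made up for the example, and the actual timings will depend on the machine.

```r
n <- 5000

# Growing: every rbind() allocates a new matrix and copies all earlier rows.
grow <- function(n) {
  out <- NULL
  for (i in seq_len(n)) out <- rbind(out, c(i, i^2))
  out
}

# Preallocating: the result matrix is created once and filled in place.
prealloc <- function(n) {
  out <- matrix(0, nrow = n, ncol = 2)
  for (i in seq_len(n)) out[i, ] <- c(i, i^2)
  out
}

t_grow <- system.time(res_grow <- grow(n))["elapsed"]
t_pre  <- system.time(res_pre  <- prealloc(n))["elapsed"]

# Both loops build the same matrix; only the allocation pattern differs.
stopifnot(isTRUE(all.equal(res_grow, res_pre)))
print(c(grow = unname(t_grow), prealloc = unname(t_pre)))
```

Both versions contain a for loop, so any difference in elapsed time comes from the repeated copying in the growing version rather than from the loop itself.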

    Obviously, my rewrite of your loop is still not pretty code. One could do this much more nicely using tidyverse or data.table. I just find it worth noting that for loops are not necessarily slow in R; dynamically growing data structures are.
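As one possible loop-free variant, here is a sketch in plain base R (whole-matrix operations, not the tidyverse or data.table route). It assumes a numeric matrix with one row per client and 60 day columns, and that every row has at least two non-missing values; the small matrix M and the helper last_mean are made up for the illustration.

```r
set.seed(1)
# Hypothetical demand matrix: 5 clients x 60 days, with two missing values
M <- matrix(sample(1:1000, 300, replace = TRUE), nrow = 5,
            dimnames = list(NULL, paste0("DAY_", 1:60)))
M[cbind(c(1, 3), c(10, 55))] <- NA

Average <- round(rowMeans(M, na.rm = TRUE), 4)
DesvEst <- round(apply(M, 1, sd, na.rm = TRUE), 4)
Max     <- round(Average + DesvEst, 4)
Min     <- round(pmax(Average - DesvEst, 0), 4)

# Clamp every row to [Min, Max]. The per-row vectors recycle down the columns,
# so element i lines up with row i. Missing cells stay missing.
Clamped <- round(pmin(pmax(M, Min), Max))

# Row means of the clamped demand over the last k days
nDay <- ncol(Clamped)
last_mean <- function(k) {
  round(rowMeans(Clamped[, (nDay - k + 1):nDay, drop = FALSE], na.rm = TRUE), 4)
}

TMP <- cbind(Average, DesvEst, Max, Min,
             Prom60 = last_mean(60), Prom30 = last_mean(30),
             Prom15 = last_mean(15), Prom07 = last_mean(7))
print(TMP)
```

Rows with fewer than two non-missing values would need separate handling, because sd returns NA for them and the clamping then behaves differently from the loop version.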
