I have a big performance problem in R. I wrote a function that iterates over a data.frame
object. It simply adds a new column to a data.frame
and a
The answers here are great. One minor aspect not covered is that the question states "My PC is still working (about 10h now) and I have no idea about the runtime". I always put in the following code into loops when developing to get a feel for how changes seem to affect the speed and also for monitoring how long it will take to complete.
dayloop2 <- function(temp){
for (i in 1:nrow(temp)){
cat(round(i/nrow(temp)*100,2),"% \r") # prints the percentage complete in realtime.
# do stuff
}
return(blah)
}
Works with lapply as well.
dayloop2 <- function(temp){
temp <- lapply(1:nrow(temp), function(i) {
cat(round(i/nrow(temp)*100,2),"% \r")
#do stuff
})
return(temp)
}
If the function within the loop is quite fast but the number of loops is large then consider just printing every so often as printing to the console itself has an overhead. e.g.
dayloop2 <- function(temp){
for (i in 1:nrow(temp)){
if(i %% 100 == 0) cat(round(i/nrow(temp)*100,2),"% \r") # prints every 100 times through the loop
# do stuff
}
return(temp)
}