I like to use a progress bar while running slow for
loops. This could be done easily with several helpers, but I do like the tkProgressBar
from tcl
I propose hereby two solutions that use the standard for
syntax, both are using the great package progress from Gábor Csárdi and Rich FitzJohn
for
function to wrap around base::for
and support progress bars.for<-
, and wrap around base::for
using syntax pb -> for(it in seq) {exp}
where pb
is progress bar built with progress::progress_bar$new()
.Both solutions behave as standard for calls :
I packaged my solution and will demo them below then will go through the code
#devtools::install_github("moodymudskipper/pbfor")
library(pbfor)
pb_for()
By default pb_for()
will override the for
function for one run only.
pb_for()
for (i in 1:10) {
# DO SOMETHING
Sys.sleep(0.5)
}
Using parameters from progress::progress_bar$new()
:
pb_for(format = "Working hard: [:bar] :percent :elapsed",
callback = function(x) message("Were'd done!"))
for (i in 1:10) {
# DO SOMETHING
Sys.sleep(0.5)
}
for<-
The only restriction compared to a standard for
call is that the first argument must exist and can't be NULL
.
i <- NA
progress_bar$new() -> for (i in 1:10) {
# DO SOMETHING
Sys.sleep(0.5)
}
We can define a custom progress bar, and maybe define it conveniently in an initialisation script or in one's R profile.
pb <- progress_bar$new(format = "Working hard: [:bar] :percent :elapsed",
callback = function(x) ("Were'd done!"))
pb -> for (i in 1:10) {
# DO SOMETHING
Sys.sleep(0.5)
}
For nested progress bars we can use the following trick :
pbi <- progress_bar$new(format = "i: [:bar] :percent\n\n")
pbj <- progress_bar$new(format = "j: [:bar] :percent ")
i <- NA
j <- NA
pbi -> for (i in 1:10) {
pbj -> for (j in 1:10) {
# DO SOMETHING
Sys.sleep(0.1)
}
}
note that due to operator precedence the only way to call for<-
and benefit from the syntax of for
calls is to use the left to right arrow ´->´.
pb_for()
pb_for()
creates a for
function object in its parent environment, then the new for
:
`*pb*`$tick()
at the end of the loop content expressionbase::`for`
in a clean environmentonce
is TRUE
(the default)It's generally sensitive to override an operator, but it cleans after itself and won't affect the global environment if used in a function so I think it's safe enough to use.
for<-
This approach :
for
It has a few drawbacks however:
fun<-
).What it does :
for<-
After this it's similar to what is described for pb_for()
in the section above.
pb_for()
pb_for <-
function(
# all args of progress::progress_bar$new() except `total` which needs to be
# infered from the 2nd argument of the `for` call, and `stream` which is
# deprecated
format = "[:bar] :percent",
width = options("width")[[1]] - 2,
complete = "=",
incomplete = "-",
current =">",
callback = invisible, # doc doesn't give default but this seems to work ok
clear = TRUE,
show_after = .2,
force = FALSE,
# The only arg not forwarded to progress::progress_bar$new()
# By default `for` will self detruct after being called
once = TRUE) {
# create the function that will replace `for`
f <- function(it, seq, expr){
# to avoid notes at CMD check
`*pb*` <- IT <- SEQ <- EXPR <- NULL
# forward all arguments to progress::progress_bar$new() and add
# a `total` argument computed from `seq` argument
pb <- progress::progress_bar$new(
format = format, width = width, complete = complete,
incomplete = incomplete, current = current,
callback = callback,
clear = clear, show_after = show_after, force = force,
total = length(seq))
# using on.exit allows us to self destruct `for` if relevant even if
# the call fails.
# It also allows us to send to the local environment the changed/created
# variables in their last state, even if the call fails (like standard for)
on.exit({
vars <- setdiff(ls(env), c("*pb*"))
list2env(mget(vars,envir = env), envir = parent.frame())
if(once) rm(`for`,envir = parent.frame())
})
# we build a regular `for` loop call with an updated loop code including
# progress bar.
# it is executed in a dedicated environment and the progress bar is given
# a name unlikely to conflict
env <- new.env(parent = parent.frame())
env$`*pb*` <- pb
eval(substitute(
env = list(IT = substitute(it), SEQ = substitute(seq), EXPR = substitute(expr)),
base::`for`(IT, SEQ,{
EXPR
`*pb*`$tick()
})), envir = env)
}
# override `for` in the parent frame
assign("for", value = f,envir = parent.frame())
}
for<-
(and fetch_name()
)`for<-` <-
function(it, seq, expr, value){
# to avoid notes at CMD check
`*pb*` <- IT <- SEQ <- EXPR <- NULL
# the symbol fed to `it` is unknown, R uses `*tmp*` for assignment functions
# so we go get it by inspecting the memory addresses
it_chr <- fetch_name(it)
it_sym <-as.symbol(it_chr)
# complete the progress bar with the `total` parameter
# we need to clone it because progress bars are environments and updated
# by reference
pb <- value$clone()
pb$.__enclos_env__$private$total <- length(seq)
# when the script ends, even with a bug, the values that have been changed
# are written to the parent frame
on.exit({
vars <- setdiff(ls(env), c("*pb*"))
list2env(mget(vars, env),envir = parent.frame())
})
# computations are operated in a separate environment so we don't pollute it
# with it, seq, expr, value, we need the progress bar so we name it `*pb*`
# unlikely to conflict by accident
env <- new.env(parent = parent.frame())
env$`*pb*` <- pb
eval(substitute(
env = list(IT = it_sym, SEQ = substitute(seq), EXPR = substitute(expr)),
base::`for`(IT, SEQ,{
EXPR
`*pb*`$tick()
})), envir = env)
# because of the `fun<-` syntax we need to return the modified first argument
invisible(get(it_chr,envir = env))
}
helpers:
fetch_name <- function(x,env = parent.frame(2)) {
all_addresses <- sapply(ls(env), address2, env)
all_addresses <- all_addresses[names(all_addresses) != "*tmp*"]
all_addresses_short <- gsub("(^|<)[0x]*(.*?)(>|$)","\\2",all_addresses)
x_address <- tracemem(x)
untracemem(x)
x_address_short <- tolower(gsub("(^|<)[0x]*(.*?)(>|$)","\\2",x_address))
ind <- match(x_address_short, all_addresses_short)
x_name <- names(all_addresses)[ind]
x_name
}
address2 <- getFromNamespace("address2", "pryr")
If you use the plyr
family of commands instead of a for loop (generally a good idea if possible), you get as an added bonus a whole system of progress bars.
R.utils
also has some progress bars built into it, and there exist instructions for using them in for loops.