Question
I would like to use system.time in R to get the total CPU time of a multicore function. The problem is that system.time apparently does not capture the CPU time spent by the child processes spawned by the parallel package.
library(doParallel)
cl <- makeCluster(2)
registerDoParallel(2)
timings <- system.time(foreach(i = 1:2) %do% rnorm(1e8))
timings then looks like this:
> timings
user system elapsed
16.883 5.731 22.899
The timings add up. Now if I use parallel processing:
timings <- system.time(foreach(i = 1:2) %dopar% rnorm(1e8))
> timings
user system elapsed
2.445 3.410 20.347
The user and system times only capture the master process. Looking at timings[4] and timings[5] shows that the user.child and sys.child times are both 0.
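For reference, the five entries of the object returned by system.time can be read by name instead of by position. The snippet below is only a minimal sketch using base R; the workload is a placeholder and not the code from this question. Note that the printed three-column summary already folds the child times into "user" and "system", so the raw vector is the place to look when checking whether child times were recorded.

```r
# system.time() returns a "proc_time" object: a named numeric vector of length 5.
# The loop is a placeholder workload, not the foreach call from the question.
timings <- system.time(for (i in 1:10) sum(rnorm(1e5)))

# Entry names, in order: user.self, sys.self, elapsed, user.child, sys.child
print(names(timings))

# Read the child entries by name (same values as timings[4] and timings[5]).
# On Windows these are NA because child times are not available there.
child_times <- unclass(timings)[c("user.child", "sys.child")]
print(child_times)
```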
What do I have to do to measure the total CPU time in R when using parallel processing?
Note: Moving the cluster startup code into the system.time call did not make a difference.
> sessionInfo()
R version 3.3.2 (2016-10-31)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Debian GNU/Linux 8 (jessie)
other attached packages:
[1] doParallel_1.0.10 iterators_1.0.8 foreach_1.4.3
Answer 1:
@chinsoon12 pointed me in the right direction. user.child and sys.child are only populated when the cluster is created by registerDoParallel, e.g.
registerDoParallel(cores = 2)
timings <- system.time(foreach(i = 1:2) %dopar% rnorm(1e8))
user.self sys.self elapsed user.child sys.child
timings 0.429 1.978 19.378 9.818 1.386
This is why it worked out of the box with doMC, where I did not manually start and stop the cluster via the cl variable.
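To turn those five entries into a single total, the self and child times can be summed. The following is a rough sketch rather than a definitive recipe: total_cpu is a hypothetical helper name, and parallel::mclapply is used as a stand-in for the foreach call on the assumption that forked workers behave like the registerDoParallel(cores = 2) setup above on a Unix-like system. Child times are only accounted for once the child processes have finished, which is my understanding of why workers that keep running in a manually created cluster show up as 0.

```r
library(parallel)

# Hypothetical helper: total CPU seconds = user + system time of the master
# process plus that of any finished child processes. na.rm = TRUE covers
# Windows, where the child entries are NA.
total_cpu <- function(pt) {
  fields <- c("user.self", "sys.self", "user.child", "sys.child")
  sum(unclass(pt)[fields], na.rm = TRUE)
}

# Forking is not available on Windows, so fall back to a single core there.
n_cores <- if (.Platform$OS.type == "unix") 2L else 1L

# Placeholder workload, much smaller than rnorm(1e8) from the question.
timings <- system.time(
  res <- mclapply(1:2, function(i) sum(rnorm(1e6)), mc.cores = n_cores)
)

print(unclass(timings))   # raw vector, including user.child and sys.child
print(total_cpu(timings)) # total CPU seconds across master and children
```

Comparing total_cpu(timings) with timings[["elapsed"]] gives a rough idea of how much work ran concurrently.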
Source: https://stackoverflow.com/questions/42963771/system-time-and-parallel-package-in-r-sys-child-is-0