doParallel

caret train binary glm fails on parallel cluster via doParallel

梦想与她 submitted on 2019-12-12 22:58:37

Question: I have seen that there are already many questions on this topic, but none seems to give a satisfying answer to my problem. I intend to use caret::train() together with the doParallel library on a Windows machine. The documentation (The caret package: 9 Parallel Processing) tells me that it will run in parallel by default if it finds a registered cluster (although its examples use the doMC library). When I attempt to set up a cluster with doParallel and follow the example calculation in its
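
A minimal sketch of the usual Windows-friendly setup, with a binary outcome built from iris as a stand-in for the asker's data (worker count and data are placeholders):

```r
library(caret)
library(doParallel)

# PSOCK clusters work on Windows; caret::train() picks up any registered
# foreach backend automatically (allowParallel = TRUE is the default).
cl <- makeCluster(2)
registerDoParallel(cl)

dat <- iris[iris$Species != "setosa", ]   # reduce iris to two classes
dat$Species <- droplevels(dat$Species)

set.seed(1)
fit <- train(Species ~ ., data = dat, method = "glm",
             trControl = trainControl(method = "cv", number = 5))

stopCluster(cl)
```

Note that with a PSOCK cluster every package used during resampling must be loadable on the workers; caret handles its own dependencies, but custom summary functions may need explicit exports.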

system.time with the parallel package in R: sys.child is 0

六眼飞鱼酱① submitted on 2019-12-12 03:38:38

Question: I would like to use system.time in R to get the total CPU time of a multicore function. The problem is that system.time obviously does not capture CPU time spent by the child processes spawned by the parallel package. library(doParallel); cl <- makeCluster(2); registerDoParallel(cl); timings <- system.time(foreach(i = 1:2) %do% rnorm(1e8)). The timings then look like this: > timings user system elapsed 16.883 5.731 22.899. The timings add up. Now if I use parallel processing: timings <- system.time
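
A small sketch of the effect (task sizes reduced so it runs quickly): with a PSOCK cluster the work happens in separate worker processes, so the master's user/system columns stay near zero and only wall-clock time is meaningful.

```r
library(doParallel)  # also attaches foreach, iterators, parallel

cl <- makeCluster(2)
registerDoParallel(cl)

# The master's user/system times do not include CPU burned inside the
# PSOCK workers, so for parallel code the "elapsed" (wall-clock) column
# is the figure to compare.
t_par <- system.time(
  foreach(i = 1:2) %dopar% sum(rnorm(1e6))
)
print(t_par)

stopCluster(cl)
```

To account for worker CPU time one would have to measure it inside each task (e.g. return proc.time() deltas from the workers) and sum the results on the master.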

How to fix “object 'i' not found” in R foreach?

若如初见. submitted on 2019-12-11 15:27:45

Question: I have to run thousands of models with thousands of dependent variables (one model per dependent variable; the dependent variables stand for different genes). So I chose the "foreach" and "doParallel" packages to speed up my analysis. I am also using the "lqmm" package, but I keep running into this error. I am thinking there might be a way to deal with this issue by editing some statements in the foreach call. I have searched all related questions on this topic but am still stuck. I am showing
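
A common cause of "object 'i' not found" is building the model formula so that the loop index is only evaluated later, inside the fitting function, where it no longer exists. A sketch of the fix, with lm standing in for lqmm and made-up gene columns (all names here are placeholders):

```r
library(doParallel)

registerDoParallel(2)

genes <- paste0("gene", 1:4)
dat <- data.frame(x = rnorm(40),
                  matrix(rnorm(160), ncol = 4,
                         dimnames = list(NULL, genes)))

# Resolve the loop index into a concrete formula *before* fitting, so
# the worker never needs to look up 'i' during lazy evaluation.
res <- foreach(i = seq_along(genes), .combine = rbind) %dopar% {
  f <- as.formula(paste(genes[i], "~ x"))
  coef(lm(f, data = dat))
}

stopImplicitCluster()
```

With lqmm one would additionally pass .packages = "lqmm" so the package is loaded on each worker.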

Do I still need makeCluster() if I'm already doing registerDoParallel(cl)?

可紊 submitted on 2019-12-11 10:45:18

Question: Reading the vignette for doParallel. Are the following two code blocks one and the same? library(doParallel); no_cores <- 8; cl <- makeCluster(no_cores); registerDoParallel(cl); pieces <- foreach(i = seq_len(length(pieces))) %dopar% { # do stuff }. Is the above just the same as this: library(doParallel); registerDoParallel(cores = 8); pieces <- foreach(i = seq_len(length(pieces))) %dopar% { # do stuff }. Must I call makeCluster() when using doParallel if I want to use multiple cores, or is the single line
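
A sketch of both variants side by side (assuming 8 workers are wanted; the loop body is a placeholder). The practical difference is who owns the workers: with an explicit cluster you must stop it yourself, while registerDoParallel(cores = n) lets doParallel manage them (forking on Unix, an implicit PSOCK cluster on Windows).

```r
library(doParallel)

# Variant 1: explicit cluster -- works on every platform; you are
# responsible for shutting it down with stopCluster().
cl <- makeCluster(8)
registerDoParallel(cl)
res1 <- foreach(i = 1:8) %dopar% i^2
stopCluster(cl)

# Variant 2: doParallel manages the workers itself.
registerDoParallel(cores = 8)
res2 <- foreach(i = 1:8) %dopar% i^2
stopImplicitCluster()
```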

assembling a matrix from diagonal slices with mclapply or %dopar%, like Matrix::bandSparse

╄→尐↘猪︶ㄣ submitted on 2019-12-11 08:37:12

Question: Right now I'm working with some huge matrices in R and I need to be able to reassemble them from diagonal bands. For programming reasons (to avoid doing n*n operations for a matrix of size n, i.e. millions of calculations), I wanted to do only 2n calculations (thousands of calculations), and therefore chose to run my function on the diagonal bands of the matrix. Now I have the results, but I need to take these matrix slices and assemble them in a way that allows me to use multiple processors.
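
A small sketch of the band-wise pattern: compute each diagonal band in a parallel worker, then let Matrix::bandSparse reassemble the list of diagonals (the band function here is a made-up stand-in for the asker's real computation):

```r
library(Matrix)
library(parallel)

n <- 6
# Hypothetical per-band worker: returns the k-th diagonal band, which
# has length n - k. mclapply forks on Unix; on Windows use parLapply
# with a PSOCK cluster instead.
band_fun <- function(k) rep(k + 1, n - k)
bands <- mclapply(0:2, band_fun, mc.cores = 2)

# bandSparse takes a list of diagonals plus their offsets k and builds
# the banded (here symmetric) sparse matrix in one call.
M <- bandSparse(n, k = 0:2, diagonals = bands, symmetric = TRUE)
```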

R: How to Parallelize multi-panel plotting with lattice in R 3.2.1?

十年热恋 submitted on 2019-12-10 22:18:48

Question: I am new to R programming and want to know how to plot 12 trellis objects made with the lattice package in parallel. Basically, after a lot of pre-processing steps, I have the following commands: plot(adhd_plot, split = c(1,1,4,3)) # plot the adhd trellis object at position (1,1) in a 4-column by 3-row grid; plot(bpd_plot, split = c(2,1,4,3), newpage = F) # plot the bpd trellis object in the 2nd column of the 4x3 grid; plot(bmi_plot, split = c(3,1,4,3), newpage = F); plot(dbp_plot, split = c(4
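
Drawing onto a single graphics device is inherently serial, so the usual workaround is to render each trellis object to its own file in a separate worker. A sketch with two throwaway iris plots standing in for the asker's 12 objects (names and files are placeholders):

```r
library(lattice)
library(parallel)

plots <- list(adhd = xyplot(Sepal.Length ~ Sepal.Width, iris),
              bpd  = xyplot(Petal.Length ~ Petal.Width, iris))

# Render each trellis object to its own PNG inside a worker; the files
# can then be combined (or the plots printed serially) afterwards.
cl <- makeCluster(2)
clusterEvalQ(cl, library(lattice))
parLapply(cl, names(plots), function(nm, plots) {
  png(paste0(nm, ".png"))
  print(plots[[nm]])
  dev.off()
}, plots = plots)
stopCluster(cl)
```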

Why does foreach %dopar% get slower with each additional node?

房东的猫 submitted on 2019-12-10 12:28:53

Question: I wrote a simple matrix multiplication to test the multithreading/parallelization capabilities of my network, and I noticed that the computation was much slower than expected. The test is simple: multiply two matrices (4096x4096) and return the computation time. Neither the matrices nor the results are stored. The computation time is not trivial (50-90 seconds, depending on your processor). The conditions: I repeated this computation 10 times using 1 processor, then split these 10 computations across 2
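
One frequent explanation is data transfer: with a PSOCK backend every task's inputs are serialized over a socket, and a 4096x4096 double matrix is about 134 MB, so shipping matrices can dwarf the multiply itself and the overhead grows with each added node. A scaled-down sketch of the measurement (n reduced so it runs in seconds):

```r
library(doParallel)

cl <- makeCluster(2)
registerDoParallel(cl)

# 'a' is automatically exported (serialized) to every worker; for the
# question's 4096x4096 matrices that is ~134 MB per matrix per worker.
n <- 256  # small stand-in for the 4096 used in the question
a <- matrix(rnorm(n * n), n)
timing <- system.time(
  foreach(i = 1:2) %dopar% (a %*% a)
)
print(timing)

stopCluster(cl)
```

A second thing to check is whether a multithreaded BLAS is already using all cores for a single multiply, in which case adding R-level workers only adds contention.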

foreach-loop (R/doParallel package) fails with big number of iterations

空扰寡人 submitted on 2019-12-10 11:54:54

Question: I have the following R code: library(doParallel); cl <- makeCluster(detectCores() - 4, outfile = ""); registerDoParallel(cl); calc <- function(i){ ... # returns a dataframe }; system.time( res <- foreach(i = 1:106800, .verbose = TRUE) %dopar% calc(i) ); stopCluster(cl). If I run this code for 1:5, it finishes successfully. The same happens if I run it for 106000:106800. But it fails if I run the full vector 1:106800, or even 100000:106800 (these are not the exact numbers I am working
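
When a loop succeeds on small ranges but fails on ~100k iterations, memory on the master is a likely suspect: foreach keeps every task's result (and bookkeeping) until the loop ends. A sketch of one mitigation, chunking the index vector with itertools so there are few large tasks instead of 100k tiny ones (calc here is a trivial stand-in):

```r
library(doParallel)
library(itertools)

cl <- makeCluster(2)
registerDoParallel(cl)

calc <- function(i) data.frame(i = i, x = sqrt(i))  # stand-in

# isplitVector hands each worker a block of indices; the task count
# (and the master's result bookkeeping) stays small.
res <- foreach(idx = isplitVector(1:10000, chunks = 20),
               .combine = rbind) %dopar% {
  do.call(rbind, lapply(idx, calc))
}

stopCluster(cl)
```

Combining with rbind as you go, or writing each chunk to disk from the worker, keeps the peak memory on the master bounded.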

Using a foreach loop in R returns NA

喜你入骨 submitted on 2019-12-10 11:44:36

Question: I would like to use the "foreach" loop in R (packages foreach + doParallel), but in my work I found that the loop returns some NAs while the classic "for" loop returns the values I want: library(foreach); library(doParallel); ncore = as.numeric(Sys.getenv('NUMBER_OF_PROCESSORS')) - 1; registerDoParallel(cores = ncore); B = 2; a = vector(); b = vector(); foreach(i = 1:B, .packages = "ez", .multicombine = T, .inorder = T, .combine = 'c') %dopar% { a[i] = i + 1; return(a) }; for(i in 1:B){ b[i] = i + 1 }; b. As you can see
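
The NAs appear because each worker gets its own copy of a, so a[2] = 3 in a worker whose a is empty produces c(NA, 3). foreach is an expression that returns a value, not a loop with side effects; a minimal corrected sketch:

```r
library(doParallel)

registerDoParallel(2)

B <- 4
# Workers run in separate processes, so assigning into 'a' there never
# reaches the master; return the per-iteration value instead and let
# .combine = "c" assemble the vector.
a <- foreach(i = 1:B, .combine = "c") %dopar% {
  i + 1
}
a  # 2 3 4 5

stopImplicitCluster()
```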

load-balancing in R foreach loops

可紊 submitted on 2019-12-10 10:43:42

Question: Is there a way to modify how R's foreach loop does load balancing with the doParallel backend? When parallelizing tasks with very different execution times, it can happen that all nodes but one have finished their tasks while the last one still has several tasks to do. Here is a toy example: library(foreach); library(doParallel); registerDoParallel(4); waittime = c(10,1,1,1,10,1,1,1,10,1,1,1,10,1,1,1); w = iter(waittime); foreach(i = w) %dopar% { message(paste("waiting", i, "on", Sys.getpid())); Sys.sleep(i
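
Two levers usually help: schedule the long tasks first, and (on the fork-based backend) turn off prescheduling so tasks are handed out one at a time rather than pre-split into fixed chunks. A sketch of both, with sleep times scaled down so the demo runs quickly; note that .options.multicore only affects the Unix fork backend and is ignored by PSOCK clusters:

```r
library(doParallel)

registerDoParallel(4)

waittime <- c(10, 1, 1, 1, 10, 1, 1, 1, 10, 1, 1, 1, 10, 1, 1, 1)

# Longest tasks first: short ones then fill in behind them.
ord <- order(waittime, decreasing = TRUE)
res <- foreach(i = waittime[ord],
               .options.multicore = list(preschedule = FALSE)) %dopar% {
  Sys.sleep(i / 100)  # scaled down for a quick demo
  Sys.getpid()
}

stopImplicitCluster()
```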