Question
I am trying to parallelize an xgboost model at the hyperparameter-tuning level. I am tuning the model in mlr and attempting to parallelize with parallelMap. I have code that works successfully on my Windows machine (with only 8 cores) and would like to make use of a Linux server (with 72 cores). I have not been able to gain any computational advantage by moving to the server, and I think this is a result of holes in my understanding of the parallelMap parameters.
I do not understand the differences between multicore, local, and socket as "modes" in parallelMap. Based on my reading, I think multicore would work for my situation, but I am not sure. I used socket successfully on my Windows machine and have tried both socket and multicore on my Linux server, with unsuccessful results:
parallelStart(mode="socket", cpu=8, level="mlr.tuneParams")
but it is my understanding that socket might be unnecessary, or perhaps slow, for parallelizing over many cores that do not need to communicate with each other, as is the case when parallelizing hyperparameter tuning.
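For reference, all three modes are started the same way; the differences lie in how workers are created. A minimal sketch (assuming 8 workers and the mlr.tuneParams level; only the multicore call is active):

```r
library(parallelMap)

# "multicore" forks the current R session (UNIX only). Workers inherit
# the parent's memory without copying data at startup, so it is usually
# the cheapest mode for independent tasks such as tuning iterations.
parallelStart(mode = "multicore", cpus = 8, level = "mlr.tuneParams")

# "socket" launches fresh R sessions and ships objects to them over a
# socket connection. It also works on Windows, but pays serialization
# overhead for everything sent to the workers.
# parallelStart(mode = "socket", cpus = 8, level = "mlr.tuneParams")

# "local" runs everything sequentially in the current session and is
# mainly useful for debugging code written for the parallel modes.
# parallelStart(mode = "local", level = "mlr.tuneParams")

parallelStop()
```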
To elaborate on my unsuccessful results on the Linux server: I am not getting errors, but things that would take < 24 hours in serial are taking > 2 weeks in parallel. Looking at the processes, I can see that I am indeed using several cores.
Each individual call to xgboost runs in a matter of a few minutes, and I am not trying to speed that up; I am only trying to tune hyperparameters over several cores.
I was concerned that my very slow results on the Linux server were due to attempts by xgboost to make use of the available cores in model building, so I passed nthread = 1 to xgboost via mlr to ensure that does not happen. Nonetheless, my code seems to run much slower on my larger Linux server than it does on my smaller Windows computer. Any thoughts as to what might be happening?
Thanks so very much.
xgb_learner_tune <- makeLearner(
  "classif.xgboost",
  predict.type = "response",
  par.vals = list(
    objective = "binary:logistic",
    eval_metric = "map",
    nthread = 1))
library(parallelMap)
parallelStart(mode="multicore", cpu=8, level="mlr.tuneParams")
tuned_params_trim <- tuneParams(
  learner = xgb_learner_tune,
  task = trainTask,
  resampling = resample_desc,
  par.set = xgb_params,
  control = control,
  measures = list(ppv, tpr, tnr, mmce)
)
parallelStop()
Edit
I am still surprised by the lack of performance improvement when attempting to parallelize at the tuning level. Are my expectations unfair? I am getting substantially slower performance with parallelMap than with tuning in serial for the process below:
numeric_ps = makeParamSet(
  makeNumericParam("C", lower = 0.5, upper = 2.0),
  makeNumericParam("sigma", lower = 0.5, upper = 2.0)
)
ctrl = makeTuneControlRandom(maxit=1024L)
rdesc = makeResampleDesc("CV", iters = 3L)
#In serial
start.time.serial <- Sys.time()
res.serial = tuneParams("classif.ksvm", task = iris.task, resampling = rdesc,
par.set = numeric_ps, control = ctrl)
stop.time.serial <- Sys.time()
stop.time.serial - start.time.serial
#In parallel with 2 CPUs
start.time.parallel.2 <- Sys.time()
parallelStart(mode="multicore", cpu=2, level="mlr.tuneParams")
res.parallel.2 = tuneParams("classif.ksvm", task = iris.task, resampling = rdesc,
par.set = numeric_ps, control = ctrl)
parallelStop()
stop.time.parallel.2 <- Sys.time()
stop.time.parallel.2 - start.time.parallel.2
#In parallel with 16 CPUs
start.time.parallel.16 <- Sys.time()
parallelStart(mode="multicore", cpu=16, level="mlr.tuneParams")
res.parallel.16 = tuneParams("classif.ksvm", task = iris.task, resampling = rdesc,
par.set = numeric_ps, control = ctrl)
parallelStop()
stop.time.parallel.16 <- Sys.time()
stop.time.parallel.16 - start.time.parallel.16
My console output is (tuning details omitted):
> stop.time.serial - start.time.serial
Time difference of 33.0646 secs
> stop.time.parallel.2 - start.time.parallel.2
Time difference of 2.49616 mins
> stop.time.parallel.16 - start.time.parallel.16
Time difference of 2.533662 mins
I would have expected things to be faster in parallel. Is that unreasonable for this example? If so, when should I expect performance improvements in parallel?
Looking at the terminal, I do seem to be using 2 (and 16) threads/processes (apologies if my terminology is incorrect).
Thanks so much for any further input.
Answer 1:
This question is more about guessing what's wrong in your setup than actually providing a "real" answer. Maybe you could also change the title, as you did not get "unexpected results".
Some points:
- nthread = 1 is already the default for xgboost in mlr.
- multicore is the preferred mode on UNIX systems.
- If your local machine is faster than your server, then either your calculations finish very quickly and the CPU frequency differs substantially between the two machines, or you should think about parallelizing at a level other than mlr.tuneParams (see here for more information).
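As a sketch of that last point: with mlr you can register the parallelization at the inner resampling instead of the tuning iterations, which can pay off when each fold is expensive but there are few tuning iterations. The cpus and maxit values below are illustrative assumptions, not a recommendation:

```r
library(mlr)
library(parallelMap)

# Parallelize the 3 CV folds of each configuration rather than the
# tuning iterations, by naming the mlr.resample level instead of
# mlr.tuneParams.
parallelStart(mode = "multicore", cpus = 3, level = "mlr.resample")
res = tuneParams("classif.ksvm", task = iris.task,
                 resampling = makeResampleDesc("CV", iters = 3L),
                 par.set = makeParamSet(
                   makeNumericParam("C", lower = 0.5, upper = 2.0),
                   makeNumericParam("sigma", lower = 0.5, upper = 2.0)),
                 control = makeTuneControlRandom(maxit = 10L))
parallelStop()
```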
Edit
Everything's fine on my machine. Looks like a local problem on your side.
library(mlr)
#> Loading required package: ParamHelpers
#> Registered S3 methods overwritten by 'ggplot2':
#> method from
#> [.quosures rlang
#> c.quosures rlang
#> print.quosures rlang
library(parallelMap)
numeric_ps = makeParamSet(
  makeNumericParam("C", lower = 0.5, upper = 2.0),
  makeNumericParam("sigma", lower = 0.5, upper = 2.0)
)
ctrl = makeTuneControlRandom(maxit=1024L)
rdesc = makeResampleDesc("CV", iters = 3L)
#In serial
start.time.serial <- Sys.time()
res.serial = tuneParams("classif.ksvm", task = iris.task, resampling = rdesc,
par.set = numeric_ps, control = ctrl)
#> [Tune] Started tuning learner classif.ksvm for parameter set:
#> Type len Def Constr Req Tunable Trafo
#> C numeric - - 0.5 to 2 - TRUE -
#> sigma numeric - - 0.5 to 2 - TRUE -
#> With control class: TuneControlRandom
#> Imputation value: 1
stop.time.serial <- Sys.time()
stop.time.serial - start.time.serial
#> Time difference of 31.28781 secs
#In parallel with 2 CPUs
start.time.parallel.2 <- Sys.time()
parallelStart(mode="multicore", cpu=2, level="mlr.tuneParams")
#> Starting parallelization in mode=multicore with cpus=2.
res.parallel.2 = tuneParams("classif.ksvm", task = iris.task, resampling = rdesc,
par.set = numeric_ps, control = ctrl)
#> [Tune] Started tuning learner classif.ksvm for parameter set:
#> Type len Def Constr Req Tunable Trafo
#> C numeric - - 0.5 to 2 - TRUE -
#> sigma numeric - - 0.5 to 2 - TRUE -
#> With control class: TuneControlRandom
#> Imputation value: 1
#> Mapping in parallel: mode = multicore; level = mlr.tuneParams; cpus = 2; elements = 1024.
#> [Tune] Result: C=1.12; sigma=0.647 : mmce.test.mean=0.0466667
parallelStop()
#> Stopped parallelization. All cleaned up.
stop.time.parallel.2 <- Sys.time()
stop.time.parallel.2 - start.time.parallel.2
#> Time difference of 16.13145 secs
#In parallel with 4 CPUs
start.time.parallel.16 <- Sys.time()
parallelStart(mode="multicore", cpu=4, level="mlr.tuneParams")
#> Starting parallelization in mode=multicore with cpus=4.
res.parallel.16 = tuneParams("classif.ksvm", task = iris.task, resampling = rdesc,
par.set = numeric_ps, control = ctrl)
#> [Tune] Started tuning learner classif.ksvm for parameter set:
#> Type len Def Constr Req Tunable Trafo
#> C numeric - - 0.5 to 2 - TRUE -
#> sigma numeric - - 0.5 to 2 - TRUE -
#> With control class: TuneControlRandom
#> Imputation value: 1
#> Mapping in parallel: mode = multicore; level = mlr.tuneParams; cpus = 4; elements = 1024.
#> [Tune] Result: C=0.564; sigma=0.5 : mmce.test.mean=0.0333333
parallelStop()
#> Stopped parallelization. All cleaned up.
stop.time.parallel.16 <- Sys.time()
stop.time.parallel.16 - start.time.parallel.16
#> Time difference of 10.14408 secs
Created on 2019-06-14 by the reprex package (v0.3.0)
Source: https://stackoverflow.com/questions/55978153/r-how-to-use-parallelmap-with-mlr-xgboost-on-linux-server-unexpected-perfor