Increase h2o.init timeout

Question

How can I increase the h2o startup timeout when starting an h2o server via R? I have a multinode AWS EC2 cluster, where I start a separate h2o server on each node. After startup, some EC2 nodes can be a bit slow, and I'd rather increase the timeout than re-run the h2o initialization code on these nodes.

What I am currently doing is along the lines of

library(doParallel)
library(foreach)

workers=parallel::makePSOCKcluster(workerIPs,master=masterIP)
registerDoParallel(workers)

foreach(i=seq_along(workers),.inorder=FALSE,.multicombine=TRUE) %dopar% {
  library(h2o)
  h2o.init(nthreads=-1)
  paste0(capture.output(h2o.clusterStatus()),collapse="\n")
}

Slow nodes throw an error at h2o.clusterStatus() if h2o.init(nthreads=-1) timed out.

BTW: I am using h2o v3.10.4.4 on Ubuntu 16.04.

Answer 1:

I looked at the h2o source code on GitHub, and there does not seem to be a timeout argument, either in the R package or in the underlying Java code. There is a Java argument called session_timeout, but I don't think it applies to my problem.

So what I did is this:

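foreach(i=seq_along(workers),.inorder=FALSE,.multicombine=TRUE) %dopar% {
  library(h2o)
  startCounter=1
  startCounterMax=3
  # Check the counter first and use the short-circuiting && so that h2o.init()
  # is attempted at most startCounterMax times, and a success on the last
  # attempt is not reported as a failure.
  while((startCounter<=startCounterMax)&&inherits(clusterStatus<-try({
      h2o.init(nthreads=-1)
      capture.output(h2o.clusterStatus())
    },silent=TRUE),"try-error")) {
    startCounter=startCounter+1
  }
  if (startCounter>startCounterMax) stop("Failed to start h2o server for ",
                                         startCounterMax," successive times")

  # The last expression is the value returned for this worker
  clusterStatus
}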

Not very nice but it does the job.
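
The same retry idea can be pulled out into a small helper that also pauses between attempts, which gives a slow node time to come up. This is only a sketch: retry, max_tries and wait_secs are hypothetical names, not part of h2o, and flaky_start is a stand-in for the real startup call so the example runs without an h2o installation.

```r
# Hypothetical helper (not part of h2o): call f up to max_tries times,
# sleeping wait_secs between failed attempts. Returns the first successful
# result, or stops with the last error message if every attempt fails.
retry <- function(f, max_tries = 3, wait_secs = 5) {
  last_error <- NULL
  for (attempt in seq_len(max_tries)) {
    result <- tryCatch(f(), error = function(e) e)
    if (!inherits(result, "error")) return(result)
    last_error <- result
    if (attempt < max_tries) Sys.sleep(wait_secs)
  }
  stop("Failed after ", max_tries, " attempts: ", conditionMessage(last_error))
}

# Stand-in for the h2o startup: fails twice, then succeeds.
calls <- 0
flaky_start <- function() {
  calls <<- calls + 1
  if (calls < 3) stop("node not ready")
  "cluster up"
}

status <- retry(flaky_start, max_tries = 5, wait_secs = 0)
```

Inside the foreach body, the stand-in would be replaced by something like retry(function() { h2o.init(nthreads=-1); capture.output(h2o.clusterStatus()) }).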

Answer 2:

If you are trying to form a cluster of several H2O nodes (say, a cluster of 3 H2O nodes with one node per machine) and you want to wait up to a specified time for it to form, you can do that in Java code: water.H2O.waitForCloudSize(3, 50 * 1000/*ms*/); I assume a corresponding parameter is available in R as well.
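
Whether the R package exposes an equivalent is not confirmed here. As a fallback, the wait-with-timeout behaviour can be approximated by polling from R. The sketch below is generic: wait_until, is_ready, timeout_secs and poll_secs are hypothetical names, and the readiness check is a placeholder. In practice it would have to test the cluster size, for example from what h2o.clusterStatus() reports, and how to extract that is an assumption left open.

```r
# Hypothetical helper (not part of h2o): poll is_ready() until it returns
# TRUE or timeout_secs have elapsed. Errors raised by is_ready() count as
# "not ready yet". Returns TRUE on success and FALSE on timeout.
wait_until <- function(is_ready, timeout_secs = 50, poll_secs = 1) {
  deadline <- Sys.time() + timeout_secs
  repeat {
    ready <- tryCatch(isTRUE(is_ready()), error = function(e) FALSE)
    if (ready) return(TRUE)
    if (Sys.time() >= deadline) return(FALSE)
    Sys.sleep(poll_secs)
  }
}

# Placeholder condition: becomes TRUE about 0.2 seconds after the start.
started <- Sys.time()
came_up <- wait_until(
  function() as.numeric(difftime(Sys.time(), started, units = "secs")) > 0.2,
  timeout_secs = 5, poll_secs = 0.05
)

# A condition that never holds runs into the timeout.
timed_out <- !wait_until(function() FALSE, timeout_secs = 0.2, poll_secs = 0.05)
```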

Source: https://stackoverflow.com/questions/43515062/increase-h2o-init-timeout

Tags

h2o