How to tune hidden_dropout_ratios in h2o.grid in R

。_饼干妹妹 提交于 2019-12-05 16:36:41

You have to fix the number of hidden layers in a grid, if experimenting with hidden_dropout_ratios. At first I messed around with combining multiple grids; then, when researching for my H2O book, I saw someone mention, in passing, how grids get combined automatically if you give them the same name.

So, you still need to call h2o.grid() for each number of hidden layers, but they can all be in the same grid at the end. Here is your example modified for that:

require(h2o)
h2o.init()
data(iris)
iris = iris[sample(1:nrow(iris)), ]
irisTrain = as.h2o(iris[1:90, ])
irisValid = as.h2o(iris[91:120, ])
irisTest = as.h2o(iris[121:150, ])

hyper_params1 <- list(
    input_dropout_ratio = c(0, 0.15, 0.3),
    hidden_dropout_ratios = list(0, 0.15, 0.3),
    hidden = list(64)
    )

hyper_params2 <- list(
    input_dropout_ratio = c(0, 0.15, 0.3),
    hidden_dropout_ratios = list(c(0,0), c(0.15,0.15),c(0.3,0.3)),
    hidden = list(c(32,32))
    )

grid = h2o.grid("deeplearning", x=colnames(iris)[1:4], y=colnames(iris)[5],
    grid_id = "stackoverflow",
    training_frame = irisTrain, validation_frame = irisValid,
    hyper_params = hyper_params1, adaptive_rate = TRUE,
    variable_importances = TRUE, epochs = 50, stopping_rounds=5,
    stopping_tolerance=0.01, activation=c("RectifierWithDropout"),
    seed=1, reproducible=TRUE)

grid = h2o.grid("deeplearning", x=colnames(iris)[1:4], y=colnames(iris)[5],
    grid_id = "stackoverflow",
    training_frame = irisTrain, validation_frame = irisValid,
    hyper_params = hyper_params2, adaptive_rate = TRUE,
    variable_importances = TRUE, epochs = 50, stopping_rounds=5,
    stopping_tolerance=0.01, activation=c("RectifierWithDropout"),
    seed=1, reproducible=TRUE)

When I went to print the grid, I was reminded there is a bug with grid output when using list hyper-parameters, such as hidden or hidden_dropout_ratios. Your code is a nice self-contained example, so I'll report that now. In the meantime, here is a one-liner to show the values of the hyper-parameter corresponding to each:

sapply(models, function(m) c(
  paste(m@parameters$hidden, collapse = ","),
  paste(m@parameters$hidden_dropout_ratios, collapse=",")
  ))

Which gives:

     [,1]    [,2] [,3]        [,4]   [,5]      [,6] 
[1,] "32,32" "64" "32,32"     "64"   "32,32"   "64" 
[2,] "0,0"   "0"  "0.15,0.15" "0.15" "0.3,0.3" "0.3"

I.e. no hidden dropout is better than a little, which is better than a lot. And two hidden layers is better than one.

By the way,

  • input_dropout_ratio: controls dropout between input layer and the first hidden layer. Can be used independently of the activation function.
  • hidden_dropout_ratios: controls dropout between each hidden layer and the next layer (which is either the next hidden layer, or the output layer). If specified, you must specify one of the "WithDropout" activation functions.
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!