How to find the validation error as a function of the number of epochs on a fine scale using h2o.grid in R

Submitted by 天大地大妈咪最大 on 2019-12-11 01:24:14

Question


I have a very noisy dataset with 2000 observations and 42 features (financial data), and I'm performing binary classification. Here I'm tuning the network using h2o.grid and providing a validation set. I've set epochs = 1000 and I'm forcing training to stop when the misclassification error does not improve by at least 1% over 5 scoring events (stopping_rounds = 5, stopping_tolerance = 0.01). I want to know which value of epochs minimises the validation error.

hyper_params = list(rho = c(0.9, 0.95, 0.99),
                    epsilon = 10^(c(-10, -8, -6, -4)),
                    hidden = list(c(64, 64)),
                    activation = c("Tanh", "Rectifier", "RectifierWithDropout"))
grid = h2o.grid("deeplearning", x = predictors, y = response,
                training_frame = tempTrain, validation_frame = tempValid,
                grid_id="h2oGrid10", hyper_params = hyper_params,
                adaptive_rate = TRUE, stopping_metric="misclassification",
                variable_importances = TRUE, epochs = 1000,
                stopping_rounds=5, stopping_tolerance=0.01, max_w2 = 20)

According to this question, the solution should be the following:

gridErr = h2o.getGrid("h2oGrid10", sort_by="err", decreasing=FALSE)
best_model = h2o.getModel(gridErr@model_ids[[1]])
solution = rev(best_model@model$scoring_history$epochs)[1]

where solution = 1000. However, inspecting the scoring_history, we observe the following output, which is quite ambiguous.

cbind(best_model@model$scoring_history$epochs,
      best_model@model$scoring_history$validation_classification_error)
      [,1]      [,2]
 [1,]    0       NaN
 [2,]   10 0.4971347
 [3,]  160 0.4813754
 [4,]  320 0.4770774
 [5,]  490 0.4799427
 [6,]  660 0.4727794
 [7,]  840 0.4713467
 [8,] 1000 0.4727794
 [9,] 1000 0.4713467

In fact, the global minimum of the validation error seems to occur at both 840 epochs and 1000 epochs. I've tried different settings, and I still find that the "optimal" number of epochs equals the initially set epochs value. Furthermore, I'm quite surprised to see such a large optimal number of epochs given the conservative values stopping_rounds = 5 and stopping_tolerance = 0.01, so I'm wondering whether I'm missing something important. How do I retrieve the optimal number of epochs, ideally on a finer scale (i.e. 1, 2, ... rather than 10, 160, ...)?

EDIT: The answer is in slide 8 here. What happens is that the best model is overwritten during the last iteration. Anyway, I've played for a while with the parameter train_samples_per_iteration, but I'm still not able to observe the evolution of the validation error against the number of epochs on a finer scale. Any idea?
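For reference, this is roughly the kind of single-model run I've been experimenting with. It is only a sketch of my current attempt, not a confirmed recipe: it sets train_samples_per_iteration = 0 (one epoch per iteration) and relaxes the scoring throttles (score_interval, score_duty_cycle) so that h2o should score the validation frame after every epoch, with early stopping disabled so the full curve is visible.

```r
library(h2o)
h2o.init()

# Sketch: force one epoch per iteration and score every iteration,
# so scoring_history has one row per epoch instead of a coarse grid.
model = h2o.deeplearning(
  x = predictors, y = response,
  training_frame = tempTrain, validation_frame = tempValid,
  hidden = c(64, 64), activation = "Tanh",
  epochs = 50,                      # fewer epochs, scored on a fine grid
  train_samples_per_iteration = 0,  # 0 = one epoch per iteration
  score_interval = 0,               # no minimum time between scoring events
  score_duty_cycle = 1,             # allow up to 100% of time for scoring
  score_validation_samples = 0,     # 0 = score on the whole validation frame
  stopping_rounds = 0,              # disable early stopping to see the full curve
  adaptive_rate = TRUE, max_w2 = 20
)

sh = model@model$scoring_history
cbind(sh$epochs, sh$validation_classification_error)
```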

Source: https://stackoverflow.com/questions/39207125/how-to-find-the-validation-error-as-a-function-of-the-number-of-epochs-on-a-fine
