Question
The CRAN implementation of random forests offers two variable importance measures: the Gini importance and the widely used permutation importance, which is defined as follows:
"For classification, it is the increase in percent of times a case is OOB and misclassified when the variable is permuted. For regression, it is the average increase in squared OOB residuals when the variable is permuted."
By default h2o.varimp() computes only the former. Is there really no option in h2o to get the alternative measure out of a random forest model?
Thanks! ML
Answer 1:
H2O does not calculate permutation importance. Please see the documentation for the explanation of how variable importance is calculated.
For convenience, I'll paste the relevant part below:
How is variable importance calculated for DRF?
Variable importance is determined by calculating the relative influence of each variable: whether that variable was selected during splitting in the tree building process and how much the squared error (over all trees) improved as a result.
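To make that description concrete, here is a rough sketch of the bookkeeping it implies: add up the squared-error improvement credited to each variable across all splits in all trees, then rescale. This is not H2O's code. The input format (a list of trees, each a list of `(variable, improvement)` pairs), the function name, and the output keys are all made up for illustration.

```python
from collections import defaultdict


def split_improvement_importance(trees):
    """Sum the squared-error improvement credited to each variable.

    trees: list of trees, each a list of (variable_name, improvement)
           pairs, one pair per split. Hypothetical format, not H2O's.
    """
    totals = defaultdict(float)
    for tree in trees:
        for variable, improvement in tree:
            totals[variable] += improvement

    top = max(totals.values())      # used to scale the best variable to 1.0
    total = sum(totals.values())    # used to express each total as a share
    ranked = sorted(totals.items(), key=lambda kv: -kv[1])
    return {
        name: {"relative": value, "scaled": value / top, "share": value / total}
        for name, value in ranked
    }


# Toy data: two trees with made-up split improvements.
trees = [
    [("age", 12.0), ("income", 3.0)],
    [("age", 8.0), ("height", 2.0)],
]
importance = split_improvement_importance(trees)
```

A variable that is never chosen for a split does not appear in the result at all, which is one way this measure differs from a permutation-based one.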
A feature request has already been filed for this; you can follow it here (though note that it is still open).
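In the meantime, permutation importance can be approximated by hand for any model that can produce predictions. The sketch below is model-agnostic plain Python and makes several assumptions: `predict` is a hypothetical callable wrapping whatever model you have, the data is a held-out set rather than the per-tree OOB samples used in the definition quoted in the question, and the error metric is mean squared error (the regression case).

```python
import random


def permutation_importance(predict, X, y, n_repeats=5, seed=0):
    """Mean increase in squared error when each column is shuffled.

    predict: callable taking a list of rows and returning predictions
             (hypothetical wrapper around a fitted model)
    X: list of rows, each a list of feature values (held-out data)
    y: list of numeric targets
    """
    rng = random.Random(seed)

    def mse(rows):
        preds = predict(rows)
        return sum((p - t) ** 2 for p, t in zip(preds, y)) / len(y)

    baseline = mse(X)
    importances = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            # Shuffle column j only; every other column stays intact.
            column = [row[j] for row in X]
            rng.shuffle(column)
            permuted = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            increases.append(mse(permuted) - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances


# Toy stand-in for a fitted model: it only looks at feature 0.
def toy_predict(rows):
    return [2.0 * row[0] for row in rows]


X = [[float(i), float((i * 7) % 5)] for i in range(20)]
y = [2.0 * row[0] for row in X]
scores = permutation_importance(toy_predict, X, y)
```

In the toy example the second feature gets a score of exactly zero because the model ignores it, while shuffling the first feature raises the error. For classification, the same loop works with a misclassification rate in place of `mse`.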
Source: https://stackoverflow.com/questions/51584970/permutation-importance-in-h2o-random-forest