Permutation importance in H2O random forest

Submitted by 混江龙づ霸主 on 2020-01-06 05:47:08

Question


The CRAN implementation of random forests (the randomForest package) offers both variable importance measures: the Gini importance and the widely used permutation importance, defined as:

For classification, it is the increase in percent of times a case is OOB and misclassified when the variable is permuted. For regression, it is the average increase in squared OOB residuals when the variable is permuted.

By default h2o.varimp() computes only the former. Is there really no option in h2o to get the alternative measure out of a random forest model?

Thanks! ML


Answer 1:


H2O does not calculate permutation importance. Please see the documentation for the explanation of how variable importance is calculated.

For your convenience, I'll paste it below as well:

How is variable importance calculated for DRF?

Variable importance is determined by calculating the relative influence of each variable: whether that variable was selected during splitting in the tree building process and how much the squared error (over all trees) improved as a result.
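To make that description concrete, here is a minimal sketch of a split-based importance, not H2O's actual code. It assumes each tree has already been reduced to a list of (feature, squared-error improvement) pairs, one per split. The function name and the input layout are hypothetical, and the three output fields are only meant to resemble the relative, scaled, and percentage figures that h2o.varimp() reports.

```python
from collections import defaultdict


def relative_influence(trees):
    """Sum squared-error improvements per feature over all trees.

    trees: list of trees, each a list of (feature, improvement) pairs,
           one pair per split (hypothetical input layout).
    """
    totals = defaultdict(float)
    for splits in trees:
        for feature, improvement in splits:
            # A feature only earns credit when it was chosen for a split.
            totals[feature] += improvement

    top = max(totals.values(), default=0.0)
    grand_total = sum(totals.values())
    return {
        feature: {
            "relative": value,
            # Scaled so that the most important feature gets 1.0.
            "scaled": value / top if top else 0.0,
            # Share of the total improvement across all features.
            "percentage": value / grand_total if grand_total else 0.0,
        }
        for feature, value in totals.items()
    }
```

A feature that is never selected for a split receives no entry at all, which is one way this measure differs from the permutation measure asked about in the question.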

A feature request for this has already been made; you can follow it here (though note that it is currently still open).
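In the meantime, the permutation measure from the question can be approximated outside the model. The sketch below is an assumption-laden illustration rather than anything H2O provides: it uses a hold-out set instead of OOB samples, measures the increase in mean squared error (the regression case), and expects a hypothetical predict callable that maps a list of rows to a list of predictions.

```python
import random


def mse(y_true, y_pred):
    """Mean squared error between two equally long sequences."""
    return sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true)


def permutation_importance(predict, X, y, n_repeats=5, seed=0):
    """Mean increase in hold-out MSE when each column is shuffled.

    predict: callable, list of rows -> list of predictions (hypothetical).
    X: hold-out rows, each a list of feature values.
    y: hold-out targets.
    """
    rng = random.Random(seed)
    baseline = mse(y, predict(X))
    importances = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            # Shuffle only column j, breaking its link to the target.
            column = [row[j] for row in X]
            rng.shuffle(column)
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            increases.append(mse(y, predict(X_perm)) - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances
```

For an H2O model, predict would have to wrap the conversion of rows to an H2OFrame and the call to the model's prediction method; that wrapper is left out here because its details depend on the client and version in use.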



Source: https://stackoverflow.com/questions/51584970/permutation-importance-in-h2o-random-forest
