What do you need to watch out for when using cross-validation with GLM lambda search?

情到浓时终转凉″ submitted on 2019-12-04 06:59:01

Question


Regarding the question "h2o.glm lambda search not appearing to iterate over all lambdas": I read it as a complaint that the selected lambda was too high, and the asker tried setting early_stopping=F in the hope that this might fix that "bug".

Wasn't the original behaviour a feature rather than a bug? If so, you should always use early_stopping=T when using cross-validation with GLM; otherwise the error estimate from cross-validation is useless, and you also risk over-fitting.

(My main question is whether my understanding of how GLM and CV work together is correct, but I'd also be interested in any other things to watch out for when using lambda_search and cross-validation together.)


Answer 1:


H2O's glm with lambda search and cross-validation should always pick the best lambda based on cross-validation and use it in the returned (main) model. The early stopping option should have no effect on the selected lambda. Its purpose is to skip computing models for lambdas > best, since they are not needed for the main model (we still compute models for lambdas < best, since that lets us use warm starting and take full advantage of strong rules).
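To make the point about selection concrete, here is a minimal pure-Python sketch; it is not H2O's implementation. The function `search_lambda`, the stand-in error curve `toy_cv_error`, and the "stop at the first non-improving lambda" rule are all assumptions made for illustration. The path order is chosen only so that the skipped models are those with lambda above the best, as described above; it is not a claim about how H2O orders its path internally.

```python
import math


def search_lambda(path, cv_error, early_stopping=True):
    """Walk `path` in the given order and return (best_lambda, visited).

    Hypothetical helper: with early_stopping=True the walk halts at the
    first lambda whose cross-validated error does not improve on the
    best seen so far. The selected lambda is the best one visited.
    """
    best_lam, best_err = None, math.inf
    visited = []
    for lam in path:
        err = cv_error(lam)
        visited.append(lam)
        if err < best_err:
            best_lam, best_err = lam, err
        elif early_stopping:
            break  # remaining lambdas are skipped
    return best_lam, visited


def toy_cv_error(lam):
    # Stand-in for a real cross-validated error: a made-up curve with a
    # single minimum near lambda = 0.01. A real curve would come from
    # fitting the model on each fold.
    return (math.log10(lam) + 2.0) ** 2 + 1.0


path = [1e-4, 1e-3, 1e-2, 1e-1, 1.0]
best_full, visited_full = search_lambda(path, toy_cv_error, early_stopping=False)
best_early, visited_early = search_lambda(path, toy_cv_error, early_stopping=True)

print(best_full, len(visited_full))
print(best_early, len(visited_early))
```

Under this simplified rule, and with a single-minimum error curve, both runs select the same lambda; early stopping only reduces how many models get computed. A curve with several local minima could behave differently under such a naive stopping rule.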

I think that with early_stopping set to false, it should compute models for all lambdas, in case the user wants to see them or do custom model selection.
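As one example of what custom model selection over the full set of lambdas could look like, the sketch below applies the one-standard-error rule, a common convention from the glmnet world that the answer does not mention. The function name `one_se_rule` and all numbers are made up for illustration; in practice the per-lambda means and standard errors would come from the cross-validation results.

```python
def one_se_rule(lambdas, cv_mean, cv_se):
    """Return the largest lambda whose mean CV error is within one
    standard error of the minimum mean CV error (hypothetical helper)."""
    i_min = min(range(len(lambdas)), key=lambda i: cv_mean[i])
    threshold = cv_mean[i_min] + cv_se[i_min]
    eligible = [lam for lam, m in zip(lambdas, cv_mean) if m <= threshold]
    return max(eligible)


# Made-up per-lambda cross-validation summaries, for illustration only.
lambdas = [1e-4, 1e-3, 1e-2, 1e-1, 1.0]
cv_mean = [0.52, 0.49, 0.45, 0.47, 0.70]
cv_se = [0.04, 0.03, 0.03, 0.03, 0.05]

chosen = one_se_rule(lambdas, cv_mean, cv_se)
print(chosen)
```

Here the minimum mean error is at lambda = 0.01, but the rule picks the more strongly regularized lambda = 0.1. A rule like this needs models for lambdas above the CV-best one, which is why having every lambda computed matters for this kind of selection.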



Source: https://stackoverflow.com/questions/45948642/what-do-you-need-to-watch-out-for-when-using-cross-validation-with-glm-lambda-se
