Caret package - glmnet variable importance

Submitted by 爷,独闯天下 on 2020-02-24 11:30:15

Question


I am using the glmnet package to perform a LASSO regression, and the caret package to compute feature importance. What I don't understand is how the importance values are calculated. Could anyone enlighten me? Is there a formula for these values, or are they based on the beta values (the model coefficients)?

ROC curve variable importance
  only 7 most important variables shown (out of 25)
          Importance
feature1      0.8974
feature2      0.8962
feature3      0.8957
feature4      0.8744
feature5      0.8701
feature6      0.8658
feature7      0.8253

Answer 1:


caret looks at the coefficients of the final fit, drops the intercept, and takes the absolute value of each remaining coefficient. These absolute values are stored as the variable importance and are used to rank the variables.

To view the source code, you can type

getModelInfo("glmnet")$glmnet$varImp

To summarize, these are the lines to calculate it:

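function(object, lambda = NULL, ...) {

  ## skipping a few lines

  ## coefficients of the fitted glmnet model at the chosen lambda
  beta <- predict(object, s = lambda, type = "coef")
  if(is.list(beta)) {
    ## a list of coefficient matrices (e.g. one per class): one column each
    out <- do.call("cbind", lapply(beta, function(x) x[,1]))
    out <- as.data.frame(out)
  } else out <- data.frame(Overall = beta[,1])
  ## drop the intercept and take absolute values
  out <- abs(out[rownames(out) != "(Intercept)",,drop = FALSE])
  out
}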
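
As a rough check of the function above, the sketch below fits a LASSO through caret on simulated data and compares the importance caret reports with the absolute coefficients computed by hand. The data (x1 to x5, y), the lambda grid, and the 5-fold cross-validation are assumptions made up for illustration and are not part of the original question. It assumes the caret and glmnet packages are installed.

```r
library(caret)   # method = "glmnet" also needs the glmnet package installed

set.seed(1)
# Simulated regression data; variable names and effect sizes are made up.
n <- 200
x <- matrix(rnorm(n * 5), ncol = 5, dimnames = list(NULL, paste0("x", 1:5)))
y <- 3 * x[, 1] - 2 * x[, 2] + 0.5 * x[, 3] + rnorm(n)

# alpha = 1 selects the LASSO penalty; the lambda grid is arbitrary.
fit <- train(x = x, y = y, method = "glmnet",
             trControl = trainControl(method = "cv", number = 5),
             tuneGrid = expand.grid(alpha = 1, lambda = c(0.01, 0.05, 0.1)))

# Importance as reported by caret, without the default 0-100 rescaling.
imp <- varImp(fit, scale = FALSE)$importance

# The same quantity by hand: |coefficient| at the selected lambda,
# with the intercept row dropped.
beta <- as.matrix(coef(fit$finalModel, s = fit$bestTune$lambda))
manual <- abs(beta[rownames(beta) != "(Intercept)", 1])

# Compare the two, matching predictors by name.
all.equal(imp$Overall, unname(manual[rownames(imp)]))

# With the default scale = TRUE, the same values are rescaled to 0-100.
scaled <- varImp(fit)$importance
```

Because the importance is just the magnitude of each coefficient, it is only comparable across predictors that are on a common scale. glmnet standardizes predictors internally by default but reports coefficients on the original scale, so centering and scaling the predictors beforehand (for example with preProcess = c("center", "scale") in train) is one way to make the ranking easier to interpret.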


Source: https://stackoverflow.com/questions/37540837/caret-package-glmnet-variable-importance
