How does glmnet compute the maximal lambda value?

迷失自我 2020-12-29 14:15

The glmnet package uses a range of LASSO tuning parameters lambda scaled from the maximal lambda_max under which no predi

4 Answers
  • 2020-12-29 14:29

    It seems lambda_max for a logistic regression is calculated similarly, with weights based on class proportions:

    set.seed(1)
    library("glmnet")
    x <- matrix(rnorm(100*20), 100, 20)
    y <- rnorm(100)

    # standard deviation with denominator n instead of n-1
    mysd <- function(y) sqrt(sum((y-mean(y))^2)/length(y))
    # center and scale the columns; scale() already returns a matrix
    sx <- scale(x, scale=apply(x, 2, mysd))

    y_bin <- factor(ifelse(y<0, -1, 1))
    prop.table(table(y_bin))
    # y_bin
    #   -1    1
    # 0.62 0.38
    fitGLM_log <- glmnet(sx, y_bin, family = "binomial")
    max(fitGLM_log$lambda)
    # [1] 0.1214006
    # weights: -0.38 for class -1 and 0.62 for class 1 (from the proportions above)
    max(abs(colSums(sx*ifelse(y<0, -.38, .62))))/100
    # [1] 0.1214006
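
    One way to read the weights `-.38` and `.62` in the code above, as a sketch rather than a statement about glmnet's internals: assume the response is coded $y_i \in \{0, 1\}$, the penalty is pure lasso, and the null model contains only an intercept, so every fitted probability equals the class-1 proportion $\bar{y}$. The gradient of the log-likelihood at $\beta = 0$ then gives

    $$\lambda_{\max} = \max_j \left| \frac{1}{N} \sum_{i=1}^{N} x_{ij}\,(y_i - \bar{y}) \right|, \qquad \bar{y} = \frac{1}{N} \sum_{i=1}^{N} y_i .$$

    Here $\bar{y} = 0.38$, so $y_i - \bar{y}$ is $0.62$ for class 1 and $-0.38$ for class -1, which are exactly the weights passed to `colSums`.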
    
  • 2020-12-29 14:37

    For your second question, see Friedman et al.'s paper, "Regularization Paths for Generalized Linear Models via Coordinate Descent". In particular, see equation (10), which holds with equality at the solution. Then check under what conditions the numerator $S(\cdot,\cdot)$ is zero for all parameters.
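
    To spell that check out, here is a sketch written from memory of the paper's coordinate-wise update, assuming standardized predictors, a centered response, and unit observation weights (the weighted form differs only in the inner sum and the denominator). With the soft-thresholding operator $S(z, \gamma) = \operatorname{sign}(z)\,(|z| - \gamma)_+$, the update is

    $$\tilde{\beta}_j \leftarrow \frac{S\!\left(\frac{1}{N} \sum_{i=1}^{N} x_{ij}\,\bigl(y_i - \tilde{y}_i^{(j)}\bigr),\; \lambda\alpha\right)}{1 + \lambda(1 - \alpha)} .$$

    At $\beta = 0$ the partial fit $\tilde{y}_i^{(j)}$ is zero, so the numerator vanishes for every $j$ exactly when $\left|\frac{1}{N} \sum_i x_{ij} y_i\right| \le \lambda\alpha$ for all $j$. The smallest such $\lambda$ is

    $$\lambda_{\max} = \frac{1}{N\alpha} \max_j \left| \sum_{i=1}^{N} x_{ij}\, y_i \right| ,$$

    which for $\alpha = 1$ is the expression `max(abs(colSums(sx*sy)))/100` used in another answer here.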

  • 2020-12-29 14:43

    According to help("glmnet") the maximal lambda value is "the smallest value for which all coefficients are zero":

    sum(fitGLM$beta[, which.max(fitGLM$lambda)])
    #[1] 0
    sum(glmnet(x,y, lambda=max(fitGLM$lambda)*0.999)$beta)
    #[1] -0.0001809804
    

    At a quick glance the value seems to be calculated by the Fortran code called by elnet.

  • 2020-12-29 14:47

    To get the same result, you need to standardize the variables using a standard deviation with denominator n instead of n-1.

    mysd <- function(y) sqrt(sum((y-mean(y))^2)/length(y))
    sx <- scale(x,scale=apply(x, 2, mysd))
    sx <- as.matrix(sx, ncol=20, nrow=100)
    sy <- as.vector(scale(y, scale=mysd(y)))
    max(abs(colSums(sx*sy)))/100
    ## [1] 0.1758808
    fitGLM <- glmnet(sx,sy)
    max(fitGLM$lambda)
    ## [1] 0.1758808
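
    The calculation above can be wrapped in a small function. This is only a sketch: `lambda_max_gaussian` is a hypothetical helper, not part of glmnet, and the `alpha` argument and the handling of an unstandardized `y` (centered but not rescaled) are my assumptions about how the formula generalizes. It is worth comparing its result with `max(glmnet(x, y)$lambda)` before relying on it.

    ```r
    # Hypothetical helper (not a glmnet function): largest lambda on the path
    # for a Gaussian model, assuming standardized predictors and an intercept.
    lambda_max_gaussian <- function(x, y, alpha = 1) {
      n <- nrow(x)
      # standard deviation with denominator n, as in mysd() above
      sd_n <- function(v) sqrt(sum((v - mean(v))^2) / length(v))
      # center each column and scale it to unit (denominator-n) variance
      sx <- scale(x, center = TRUE, scale = apply(x, 2, sd_n))
      # largest absolute inner product between a column and the centered response
      max(abs(crossprod(sx, y - mean(y)))) / (n * alpha)
    }

    set.seed(1)
    x <- matrix(rnorm(100 * 20), 100, 20)
    y <- rnorm(100)
    lam <- lambda_max_gaussian(x, y)
    ```

    Passing the standardized response `sy` from above instead of `y` reproduces the `max(abs(colSums(sx*sy)))/100` calculation, and the function returns `Inf` for `alpha = 0`, so it only makes sense for `alpha > 0`.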
    