I am using the glmnet package to get the following graph from the mtcars dataset (regression of mpg on the other variables):
library(glmnet)
fit = glmnet(as.matrix(mtcars[-1
Here is a modification of the best answer that draws line segments from the curves to the labels instead of placing the text directly on top of the curves. This is especially useful when there are many variables and you only want to label those with nonzero coefficients at lambda.min:
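# Note: the argument 'lra' is a cv.glmnet object
lbs_fun <- function(lra, ...) {
  fit <- lra$glmnet.fit
  L <- which(fit$lambda == lra$lambda.min)  # column index of lambda.min
  # nonzero coefficients at lambda.min, sorted so the segments do not cross
  ystart <- sort(fit$beta[abs(fit$beta[, L]) > 0, L])
  labs <- names(ystart)
  # range of the coefficients at the smallest lambda (the last column),
  # used instead of a hard-coded column 100 because the path can stop early
  r <- range(fit$beta[, ncol(fit$beta)])
  yfin <- seq(r[1], r[2], length.out = length(ystart))
  xstart <- log(lra$lambda.min)
  xfin <- xstart + 1
  text(xfin + 0.3, yfin, labels = labs, ...)
  segments(xstart, ystart, xfin, yfin)
}
plot(lra$glmnet.fit, label = FALSE, xvar = "lambda", xlim = c(-5.2, 0), lwd = 2)  # xlim and lwd are optional
lbs_fun(lra)  # add the labels and segments to the existing plot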
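The code above assumes that a cv.glmnet object called lra already exists. A minimal sketch of how it might be created from the mtcars data in the question follows; the names x, y, L and nonzero are illustrative only, and the seed is an arbitrary choice to make the random cross-validation folds reproducible.

```r
library(glmnet)

set.seed(1)                       # cv.glmnet assigns folds at random
x <- as.matrix(mtcars[-1])        # predictors: every column except mpg
y <- mtcars[, 1]                  # response: mpg
lra <- cv.glmnet(x, y)            # cross-validated lasso fit

# Column of the coefficient path closest to lambda.min
# (which.min(abs(...)) is used here as a guard against floating-point mismatch)
L <- which.min(abs(lra$glmnet.fit$lambda - lra$lambda.min))

# Names of the variables that lbs_fun would label (nonzero at lambda.min)
beta_min <- lra$glmnet.fit$beta[, L]
nonzero <- names(beta_min)[beta_min != 0]
print(nonzero)
```

An object built this way can then be passed to lbs_fun after the plot call; the xlim values may need adjusting to the lambda range of your own fit.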
An alternative is the plot_glmnet function in the plotmo package. It automatically positions the variable names and has a few other bells and whistles. For example, the following code
library(glmnet)
mod <- glmnet(as.matrix(mtcars[-1]), mtcars[,1])
library(plotmo) # for plot_glmnet
plot_glmnet(mod)
gives a plot of the coefficient paths with each curve labeled by its variable name.
The variable names are spread out to prevent overplotting, but we can still make out which curve is associated with which variable. Further examples can be found in Chapter 6 of the plotres vignette, which is included in the plotmo package.
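As a rough sketch of the other options, and assuming the label and xvar arguments behave as described on the plot_glmnet help page (worth checking against your installed plotmo version), the number of labeled curves and the x axis can be changed like this:

```r
library(glmnet)
library(plotmo)  # for plot_glmnet

mod <- glmnet(as.matrix(mtcars[-1]), mtcars[, 1])

# Assumption: 'label' accepts an integer giving how many variable names to show
plot_glmnet(mod, label = 5)

# Assumption: xvar = "norm" plots against the L1 norm of the coefficients,
# and label = TRUE shows every variable name
plot_glmnet(mod, xvar = "norm", label = TRUE)
```

Limiting the number of labels is mostly useful for wide datasets, where labeling every curve would crowd the margin.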
As the labels are hard coded, it is perhaps easier to write a quick function. This is just a quick attempt, so it can be made more thorough. I would also note that the lasso is normally used with a lot of variables, so the labels will often overlap (as seen even in your small example).
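# Label each coefficient path at the smallest lambda (the last fitted value)
lbs_fun <- function(fit, ...) {
  L <- length(fit$lambda)   # index of the smallest lambda
  x <- log(fit$lambda[L])   # x position on the log(lambda) axis
  y <- fit$beta[, L]        # coefficients at that lambda
  labs <- names(y)
  text(x, y, labels = labs, ...)  # extra arguments are passed on to text()
}
# plot
plot(fit, xvar = "lambda")
# label
lbs_fun(fit)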
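One way to reduce the overlap mentioned above is to label only the largest coefficients. The following is a sketch rather than a polished solution; lbs_top and its argument k are hypothetical names introduced here, and the fit is rebuilt from the mtcars data so the snippet runs on its own.

```r
library(glmnet)

x <- as.matrix(mtcars[-1])
fit <- glmnet(x, mtcars[, 1])

# Hypothetical variant of lbs_fun: label only the k coefficients that are
# largest in absolute value at the smallest lambda
lbs_top <- function(fit, k = 5, ...) {
  L <- length(fit$lambda)
  y <- fit$beta[, L]
  keep <- order(abs(y), decreasing = TRUE)[seq_len(min(k, length(y)))]
  text(log(fit$lambda[L]), y[keep], labels = names(y)[keep], ...)
  invisible(names(y)[keep])  # return the labeled names for inspection
}

plot(fit, xvar = "lambda")
lbs_top(fit, k = 3, cex = 0.8)  # cex is passed through to text()
```

Because the function returns the labeled names invisibly, you can assign the result to check which variables were picked.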