glmmLasso error and warning

Submitted by 亡梦爱人 on 2020-01-04 07:55:04

Question


I am trying to perform variable selection in a generalized linear mixed model using glmmLasso, but I am getting an error and a warning that I cannot resolve. The dataset is unbalanced, with some participants (PTNO) having more samples than others; there is no missing data. My dependent variable is binary; all other variables (besides the ID variable PTNO) are continuous. I suspect something very generic is happening, but I fail to see it and have not found a solution in the documentation or on the web. The code, which is basically just adapted from the glmmLasso soccer example, is:

glm8 <- glmmLasso(Group ~ NDUFV2_dCTABL + GPER1_dCTABL + ESR1_dCTABL + ESR2_dCTABL +
                    KLF12_dCTABL + SP4_dCTABL + SP1_dCTABL + PGAM1_dCTABL +
                    ANK3_dCTABL + RASGRP1_dCTABL + AKT1_dCTABL + NUDT1_dCTABL +
                    POLG_dCTABL + ADARB1_dCTABL + OGG_dCTABL + PDE4B_dCTABL +
                    GSK3B_dCTABL + APOE_dCTABL + MAPK6_dCTABL,
                  rnd = list(PTNO = ~1),
                  family = poisson(link = log), data = stackdata, lambda = 100,
                  control = list(print.iter = TRUE, start = c(1, rep(0, 29)), q.start = 0.7))

The error message is displayed below. Specifically, I do not believe there are any NAs in the dataset, and I am unsure what the warning about the factor variable means.

Iteration 1
Error in grad.lasso[b.is.0] <- score.beta[b.is.0] - lambda.b * sign(score.beta[b.is.0]) :
  NAs are not allowed in subscripted assignments
In addition: Warning message:
In Ops.factor(y, Mu) : ‘-’ not meaningful for factors

An abbreviated dataset containing the necessary variables is available in R format and can be downloaded here. I hope someone can guide me on how to proceed with the analysis. Please let me know if there is anything wrong with the dataset or if you cannot download it. Any help is much appreciated.


Answer 1:


Just to follow up on @Kristofersen's comment above: it is indeed the start vector that messes up your analysis.

If I run

glm8 <- glmmLasso(Group ~ NDUFV2_dCTABL + GPER1_dCTABL + ESR1_dCTABL + ESR2_dCTABL +
                    KLF12_dCTABL + SP4_dCTABL + SP1_dCTABL + PGAM1_dCTABL +
                    ANK3_dCTABL + RASGRP1_dCTABL + AKT1_dCTABL + NUDT1_dCTABL +
                    POLG_dCTABL + ADARB1_dCTABL + OGG_dCTABL + PDE4B_dCTABL +
                    GSK3B_dCTABL + APOE_dCTABL + MAPK6_dCTABL,
                  rnd = list(PTNO = ~1),
                  family = binomial(),
                  data = stackdata,
                  lambda = 100,
                  control = list(print.iter = TRUE))

then everything is fine and dandy (i.e., it converges and produces a solution). You copied the example, which uses Poisson regression, so you need to adapt the code to your situation. I have no idea whether the output makes sense.
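
On the start vector: c(1, rep(0, 29)) has 30 entries, which was presumably sized for the soccer data rather than for yours. As far as I understand the glmmLasso documentation, start holds starting values for the fixed and the random effects together, so with a random intercept only (rnd = list(PTNO = ~1)) its length would need to be (intercept + number of predictors) + (number of PTNO levels). Here is a minimal base-R sketch of that count. The data frame toy and the columns x1 and x2 are made-up stand-ins, not your data, and the length rule is my reading of the docs rather than something verified here:

```r
# Toy stand-in for stackdata (hypothetical names and values)
set.seed(1)
toy <- data.frame(
  PTNO = factor(rep(1:5, times = c(2, 3, 1, 4, 2))),  # unbalanced: 5 participants, 12 rows
  x1   = rnorm(12),
  x2   = rnorm(12)
)

# Fixed effects: one column per predictor plus the intercept
n_fixed  <- ncol(model.matrix(~ x1 + x2, data = toy))

# Random intercepts: one per participant (assumes rnd = list(PTNO = ~1))
n_random <- nlevels(toy$PTNO)

# Same pattern as the soccer example: 1 for the intercept, 0 elsewhere
start_vec <- c(1, rep(0, n_fixed + n_random - 1))
length(start_vec)
```

With your data that would be 20 fixed-effect entries (intercept plus 19 predictors) plus one per participant. Leaving start out altogether, as in the call above, lets the package fall back to its default.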

Quick note: I used the binomial distribution in the code above since your outcome is binary. If it makes sense to estimate relative risks, then Poisson may be reasonable (and it also converges), but you need to recode your outcome, because the two groups are coded as 1 and 2 and that will certainly mess up the Poisson regression.

In other words, run

stackdata$Group <- stackdata$Group-1

before you run the analysis.
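
One caveat: the warning in the question comes from Ops.factor, which suggests Group may be stored as a factor in your copy of the data. In that case the subtraction above returns NAs with the same kind of warning. Here is a small sketch of a recoding that works for a factor with labels "1" and "2". The toy data and the column name Group01 are hypothetical:

```r
# Toy outcome stored as a factor with labels "1" and "2"
toy <- data.frame(Group = factor(c(1, 2, 2, 1)))

# Convert via the labels (not the internal integer codes), then shift to 0/1
toy$Group01 <- as.numeric(as.character(toy$Group)) - 1

table(toy$Group01)
```

If Group is already numeric, the one-line subtraction is enough.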



Source: https://stackoverflow.com/questions/40484977/glmmlasso-error-and-warning
