Warning messages when trying to run glmer in R

情话喂你 2021-01-31 06:33

Dear Stack Overflow community,

Currently I'm trying to rerun an old data analysis, a binomial glmer model (from early 2013), on the latest version of R and lme4, because

1 answer
  • 2021-01-31 06:52

    tl;dr at least based on the subset of data you provided, this is a fairly unstable fit. The warnings about near unidentifiability go away if you scale the continuous predictors. Trying with a wide variety of optimizers, we get about the same log-likelihoods, and parameter estimates that vary by a few percent; two optimizers (nlminb from base R and BOBYQA from the nloptr package) converge without warnings, and are probably giving the "correct" answer. I haven't computed confidence intervals, but I suspect that they're very wide. (Your mileage may differ somewhat with your full data set ...)

    source("SO_23478792_dat.R")  ## I put the data you provided in here
    

    Basic fit (replicated from above):

    library(lme4)
    df$SUR.ID <- factor(df$SUR.ID)
    df$replicate <- factor(df$replicate)
    Rdet <- cbind(df$ValidDetections,df$FalseDetections)
    Unit <- factor(1:length(df$ValidDetections))
    m1 <- glmer(Rdet ~ tm:Area + tm:c.distance +
                c.distance:Area + c.tm.depth:Area +
                c.receiver.depth:Area + c.temp:Area +
                c.wind:Area +
                c.tm.depth + c.receiver.depth +
                c.temp +c.wind + tm + c.distance + Area +
                replicate +
                (1|SUR.ID) + (1|Day) + (1|Unit) ,
                data = df, family = binomial(link=logit))
    

    I get more or less the same warnings you did (slightly fewer, since the development version of lme4 has been tweaked a little):

    ## 1: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv,  :
    ##   Model failed to converge with max|grad| = 1.52673 (tol = 0.001, component 1)
    ## 2: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv,  :
    ##   Model is nearly unidentifiable: very large eigenvalue
    ##  - Rescale variables?;Model is nearly unidentifiable: large eigenvalue ratio
    ## - Rescale variables?
    

    I tried various little things (restarting from the previous fitted values, switching optimizers) without much change in the results (i.e. the same warnings).

    ss <- getME(m1,c("theta","fixef"))
    m2 <- update(m1,start=ss,control=glmerControl(optCtrl=list(maxfun=2e4)))
    m3 <- update(m1,start=ss,control=glmerControl(optimizer="bobyqa",
                             optCtrl=list(maxfun=2e4)))
    

    Following the advice in the warning message (rescaling the continuous predictors):

    numcols <- grep("^c\\.",names(df))
    dfs <- df
    dfs[,numcols] <- scale(dfs[,numcols])
    m4 <- update(m1,data=dfs)
    

    This gets rid of the near-unidentifiability ("Rescale variables?") warnings, but the warning about a large gradient persists.
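
    As a minimal sketch of what the rescaling step does, the same grep/scale idiom can be tried on a toy data frame. The data and column names below are made up for illustration and are not the poster's data. After scaling, each continuous column has mean 0 and standard deviation 1, while the factor column is left alone.

    ```r
    ## toy data; the column names are hypothetical stand-ins for the c.* predictors
    toy <- data.frame(c.distance = c(100, 200, 300, 400),
                      c.temp     = c(10.5, 11.0, 12.5, 14.0),
                      Area       = factor(c("a", "a", "b", "b")))
    ## pick out the continuous predictors by their "c." prefix, as in the answer
    numcols <- grep("^c\\.", names(toy))
    toy_s <- toy
    ## center each selected column and divide it by its standard deviation
    toy_s[, numcols] <- scale(toy_s[, numcols])
    ## check the result: means should be ~0 and standard deviations ~1
    col_means <- colMeans(toy_s[, numcols])
    col_sds   <- apply(toy_s[, numcols], 2, sd)
    ```

    Coefficients from a fit to the scaled data are then per standard deviation of each predictor rather than per original unit.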

    Use some utility code to fit the same model with many different optimizers:

    ## note: recent lme4 releases export allFit() themselves, in which case
    ## sourcing the script below is unnecessary
    afurl <- "https://raw.githubusercontent.com/lme4/lme4/master/misc/issues/allFit.R"
    ## http://tonybreyal.wordpress.com/2011/11/24/source_https-sourcing-an-r-script-from-github/
    library(RCurl)
    eval(parse(text=getURL(afurl)))
    aa <- allFit(m4)
    is.OK <- sapply(aa,is,"merMod")  ## nlopt NELDERMEAD failed, others succeeded
    ## extract just the successful ones
    aa.OK <- aa[is.OK]
    

    Pull out warnings:

    lapply(aa.OK,function(x) x@optinfo$conv$lme4$messages)
    

    (All but nlminb and nloptr BOBYQA give convergence warnings.)

    Log-likelihoods are all approximately the same:

    summary(sapply(aa.OK,logLik),digits=6)
    ##     Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
    ## -107.127 -107.114 -107.111 -107.114 -107.110 -107.110 
    

    (again, nlminb and nloptr BOBYQA have the best fits/highest log-likelihoods)

    Compare fixed effect parameters across optimizers:

    aa.fixef <- t(sapply(aa.OK,fixef))
    library(ggplot2)
    library(reshape2)
    library(plyr)
    aa.fixef.m <- melt(aa.fixef)
    models <- levels(aa.fixef.m$Var1)
    (gplot1 <- ggplot(aa.fixef.m,aes(x=value,y=Var1,colour=Var1))+geom_point()+
        facet_wrap(~Var2,scales="free")+
        scale_y_discrete(breaks=models,
                         labels=abbreviate(models,6)))
    ## coefficients of variation of fixed-effect parameter estimates:
    summary(unlist(daply(aa.fixef.m,"Var2",summarise,sd(value)/abs(mean(value)))))
    ##     Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
    ## 0.003573 0.013300 0.022730 0.019710 0.026200 0.035810 
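
    The coefficient of variation used above is sd/|mean| of each parameter's estimates across optimizers. A base-R sketch of the same calculation, without plyr, on a made-up matrix of estimates (the numbers and optimizer labels are invented for illustration, not taken from the fits above):

    ```r
    ## made-up estimates: rows = optimizers, columns = fixed-effect parameters
    est <- rbind(bobyqa = c(1.00, -2.00),
                 nlminb = c(1.02, -2.10),
                 NM     = c(0.98, -1.90))
    colnames(est) <- c("(Intercept)", "c.temp")
    ## per-parameter coefficient of variation across optimizers
    cv <- apply(est, 2, function(v) sd(v) / abs(mean(v)))
    ```

    With real fits, `est` would be the `aa.fixef` matrix, which already has one row per optimizer and one column per parameter.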
    

    Compare variance estimates (not as interesting: all optimizers except Nelder-Mead give exactly zero variance for Day and SUR.ID):

    aa.varcorr <- t(sapply(aa.OK,function(x) unlist(VarCorr(x))))
    aa.varcorr.m <- melt(aa.varcorr)
    gplot1 %+% aa.varcorr.m
    

    I tried to run this with lme4.0 ("old lme4"), but got various "Downdated VtV" errors, even with the scaled data set. Perhaps that problem would go away with the full data set?

    I haven't yet explored why drop1 doesn't work properly if the initial fit returns warnings ...
