问题
Does predictInterval()
from the merTools
package not like nested random effects? For example, using the msleep
dataset from the ggplot2
package:
library("lme4")
library("merTools")
library("ggplot2")
mod <- lmer(sleep_total ~ bodywt + (1|vore/order), data=msleep)
predInt <- predictInterval(merMod=mod, newdata=msleep)
Returns an error:
Error in '[.data.frame'(newdata, , j) : undefined columns selected
This runs fine no problem:
mod <- lmer(sleep_total ~ bodywt + (1|vore) + (1|order), data=msleep)
predInt <- predictInterval(merMod=mod, newdata=msleep)
(Well actually it gives a warning about NA
levels in the random effect variables, but I'm not concerned about that)
UPDATE
As discussed in the comments of Ben Bolker's answer below, a new version of merTools
accounts for nested random effects. However, when I try to predict for data that contains new levels of the nested random effect, I get errors.
This works:
mod <- lmer(sleep_total ~ bodywt + (1|vore/order), data=msleep)
predInt <- predictInterval(merMod=mod, newdata=msleep)
And this works, albeit with a couple of warnings (see below for additional Q about the warnings*):
mod <- lmer(sleep_total ~ bodywt + (1|vore) + (1|order), data=msleep)
msleep2 <- msleep %>% mutate(vore = "omni")
predInt <- predictInterval(merMod=mod, newdata=msleep2)
But this does not work:
mod <- lmer(sleep_total ~ bodywt + (1|vore/order), data=msleep)
msleep2 <- msleep %>% mutate(vore = "omni")
predInt <- predictInterval(merMod=mod, newdata=msleep2)
With the following error:
Error in `[.data.frame`(tmp, alllvl) : undefined columns selected
In addition: Warning message:
In predictInterval(merMod = mod, newdata = msleep3) :
newdata is tbl_df or tbl object from dplyr package and has been
coerced to a data.frame
And here, "omni"
isn't actually a new level of vore
, but when combined with order
, it creates new nested combinations of the variables.
If I use "new"
or anything else that isn't an observed level of vore
, I get similar results: It works for the non-nested version of the model, but not for the nested version.
*Also, should I be concerned about the warning given by the second model chunk above:
> mod <- lmer(sleep_total ~ bodywt + (1|vore) + (1|order), data=msleep)
> msleep2 <- msleep %>% mutate(vore = "omni")
> predInt <- predictInterval(merMod=mod, newdata=msleep2)
Warning messages:
1: In predictInterval(merMod = mod, newdata = msleep2) :
newdata is tbl_df or tbl object from dplyr package and has been
coerced to a data.frame
2: In chol.default(sigma, pivot = TRUE) :
the matrix is either rank-deficient or indefinite
I'm going to guess the second one is a result of vore
taking on the same value for each observation, but that shouldn't be an issue for prediction, should it? I could see it being an issue if the variable took on the same value when I was fitting the model, but don't think it should be an issue when predicting new observations?
回答1:
One can (apparently) work around this by writing out the interaction term explicitly. Warning: I haven't actually checked to make sure the resulting predictions are correct, just seen that no error is produced and the resulting object is approximately sensible ...
msleep <- transform(msleep,voreOrder=interaction(vore,order,drop=TRUE))
mod2 <- lmer(sleep_total ~ bodywt + (1|vore)+(1|voreOrder), data=msleep)
predInt <- predictInterval(merMod=mod2, newdata=msleep)
This does generate warning messages, but apparently they're due to <NA>
values in the vore
variable (I don't know this data set ...)
来源:https://stackoverflow.com/questions/35166173/mertools-predictinterval-for-model-with-nested-random-effect