Question
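I ran into this problem while trying to get the empirical distribution of the Kenward-Roger (K-R) degrees of freedom: the output of lmerTest's anova for an already fitted model changes when the data frame is overwritten afterwards, even though the model is not refitted.
This seems like fairly dangerous behaviour. Does it constitute a bug?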
Reproducible example:
## import lmerTest package
library(lmerTest)
## an object of class merModLmerTest
m <- lmer(Informed.liking ~ Gender + Information + Product + (1 | Consumer), data = ham)
# simulate data from fitted model
simData <- ham
simData$Informed.liking <- unlist(simulate(m))
# fit model to simulated data
m1 <- lmer(Informed.liking ~ Gender + Information + Product + (1 | Consumer), data = simData)
stats:::anova(m1)
lmerTest:::anova(m1)
# simulate again, WITHOUT refitting
simData$Informed.liking <- unlist(simulate(m))
stats:::anova(m1) # same as before
lmerTest:::anova(m1) # not same as before!
Answer 1:
My response is not a solid answer, more of an extended comment.
This looks pretty bad. In fact, I discovered today that almost all the analyses I conducted in a project that was on the verge of submission have to be redone because of a related behaviour of lmerTest.
The problem I ran into arose when I used a short function that fits a model with lmer and then returns coef(summary(model)): simple stuff, two lines of code. However, the argument to this function was named data, and I also had a data frame called data in the workspace. It seems that although the local variable from the function scope was correctly used during fitting with lmer, the workspace variable data was used during summary (and it often was not the same as the data frame passed to the function). This led to invalid t values and degrees of freedom, and therefore incorrect p values (the estimates and their standard errors were OK, however).
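To make the name clash concrete, here is a minimal sketch in base R only. It does not reproduce lmerTest's internals, which I have not checked. It uses lm() as a stand-in for lmer() and assumes that the later step re-evaluates the data argument of the stored call. The names workspace, fit_in_function and other are made up for the illustration.

```r
# Hypothetical illustration in base R; lm() stands in for lmer().
workspace <- environment()  # the global environment when run at top level

data <- data.frame(x = 1:10, y = (1:10) * 2)   # workspace object named `data`

fit_in_function <- function(data) {
  # Inside the function, `data` is the local argument and masks the
  # workspace object of the same name, so the fit uses the right data.
  lm(y ~ x, data = data)
}

other <- data.frame(x = 1:10, y = (10:1) * 3)  # the data actually passed in
fit <- fit_in_function(other)

# The fitted object keeps the unevaluated call, i.e. only the symbol `data`.
stored <- getCall(fit)$data
print(stored)

# Re-evaluating that symbol from the workspace picks up the wrong data frame.
looked_up <- eval(stored, envir = workspace)
print(identical(looked_up, other))  # FALSE
print(identical(looked_up, data))   # TRUE

# The model frame stored inside the fit still holds the data used for fitting.
print(identical(model.frame(fit)$y, other$y))  # TRUE
```

If something like this is what happens, giving the function argument a name that does not exist in the workspace should at least turn a silent mismatch into a lookup error, but that is an assumption on my part, not something I have verified against lmerTest.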
So, answering your question:
This seems like fairly dangerous behaviour? Does it constitute a bug?
It seems dangerous indeed, and I would definitely call this a bug.
Source: https://stackoverflow.com/questions/42805643/lmertestanova-uses-lazy-loading-of-data-sets