Question
I use the lme function in the nlme R package to test whether the levels of the factor items have a significant interaction with the levels of the factor condition. The factor condition has two levels, Control and Treatment, and the factor items has three levels: E1, E2, and E3. I use the following code:
f.lme = lme(response ~ 0 + factor(condition) * factor(items), random = ~1|subject)
where subject is the random effect. In this way, when I run:
summary(f.lme)$tTable
I will get the following output:
factor(condition)Control
factor(condition)Treatment
factor(items)E2
factor(items)E3
factor(condition)Treatment:factor(items)E2
factor(condition)Treatment:factor(items)E3
together with the Value, Std.Error, DF, t-value, and p-value columns. I have two questions:
1. If I want to compare Control vs. Treatment, shall I just use the estimable() function in gmodels and make a contrast of (-1, 1, 0, 0, 0, 0)?
2. I am interested in whether the levels of items, i.e. E1, E2, E3, are different across condition, so I am interested in whether the interaction terms are significant (by just checking the p-value column?):
factor(condition)Treatment:factor(items)E2
factor(condition)Treatment:factor(items)E3
However, how can I tell if factor(condition)Treatment:factor(items)E1 is significant or not? It is not shown in the summary output, and I think it has something to do with the contrasts used in R... Thanks a lot!
Answer 1:
I respectfully disagree with @sven-hohenstein, who wrote:

"In R, the default coding for categorical variables is treatment contrast coding. In treatment contrasts, the first level is the reference level. All remaining factor levels are compared with the reference level."

First, the fixed effects are specified here with a zero intercept, ... ~ 0 + .... This means that the condition coding is no longer contr.treatment. If I'm not mistaken, the main effects of Control and Treatment are now interpretable as their respective deviations from the group mean...

"In your model, the factor items has three levels: E1, E2, and E3. The two contrasts test the difference between (a) E2 and E1, and (b) E3 and E1. The main effects of these contrasts are estimated for the level Control of the factor condition, since this is the reference category of this factor."

...when the value of items is at its reference level of E1! Therefore:
- Main effect Control = how much Control:E1 observations deviate from the mean of item E1.
- Main effect Treatment = how much Treatment:E1 observations deviate from the mean of item E1.
- Main effect E2 = how much Control:E2 observations deviate from the mean of item E2.
- Main effect E3 = how much Control:E3 observations deviate from the mean of item E3.
- Interaction Treatment:E2 = how much Treatment:E2 observations deviate from the mean of item E2.
- Interaction Treatment:E3 = how much Treatment:E3 observations deviate from the mean of item E3.
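One quick way to see which cell each of these coefficients picks out is to inspect the design matrix R builds for this formula. A minimal sketch, using a toy data frame (the name dat and its construction are mine, not from the question):

# Toy data frame with the question's two factors
dat <- expand.grid(condition = factor(c("Control", "Treatment")),
                   items     = factor(c("E1", "E2", "E3")))
# Design matrix under the zero-intercept parameterization
model.matrix(~ 0 + condition * items, data = dat)

The column names match the six rows of the tTable in the question, and the 0/1 pattern in each row shows which coefficients sum to each condition-by-items cell mean.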
Thanks for the pointer to estimable; I hadn't tried it before. For custom contrasts, I've been (ab)using glht from the multcomp package.
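For example, the asker's first question (Control vs. Treatment at the reference item E1) could be set up along these lines. A sketch only, assuming the six fixed effects come out in the order listed in the question's tTable:

# Control vs. Treatment at item E1, via a custom contrast matrix
library(multcomp)
K <- rbind("Treatment - Control at E1" = c(-1, 1, 0, 0, 0, 0))
summary(glht(f.lme, linfct = K))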
Answer 2:
You'll need to address your second question about the interaction first. You can certainly set up the likelihood ratio test as in Jan van der Laan's answer. You can also use anova directly on a fitted lme object; see the help page for anova.lme for more information.
In terms of interpreting your coefficients, I often find it takes making a summary table of the group means to figure out which linear combination of coefficients in the model represents each group. I will show an example with the intercept removed, as in your question, although I find this parameterization rarely helps me figure out my coefficients once I have two factors in a model. Here is an example of what I mean with the Orthodont dataset (which I decided to make balanced):
require(nlme)
# Make dataset balanced
Orthodont2 = Orthodont[-c(45:64),]
# Factor age
Orthodont2$fage = factor(Orthodont2$age)
# Create a model with an interaction using lme; remove the intercept
fit1 = lme(distance ~ Sex*fage - 1, random = ~1|Subject, data = Orthodont2)
summary(fit1)
Here are the estimated fixed effects. But what do each of these coefficients represent?
Fixed effects: distance ~ Sex * fage - 1
Value Std.Error DF t-value p-value
SexMale 23.636364 0.7108225 20 33.25213 0.0000
SexFemale 21.181818 0.7108225 20 29.79903 0.0000
fage10 0.136364 0.5283622 61 0.25809 0.7972
fage12 2.409091 0.5283622 61 4.55954 0.0000
fage14 3.727273 0.5283622 61 7.05439 0.0000
SexFemale:fage10 0.909091 0.7472171 61 1.21664 0.2284
SexFemale:fage12 -0.500000 0.7472171 61 -0.66915 0.5059
SexFemale:fage14 -0.818182 0.7472171 61 -1.09497 0.2778
Making a summary of group means helps figure this out.
require(plyr)
ddply(Orthodont2, .(Sex, age), summarise, dist = mean(distance) )
Sex age dist
1 Male 8 23.63636
2 Male 10 23.77273
3 Male 12 26.04545
4 Male 14 27.36364
5 Female 8 21.18182
6 Female 10 22.22727
7 Female 12 23.09091
8 Female 14 24.09091
Notice that the first fixed-effect coefficient, called SexMale, is the mean distance for the age-8 males. The fixed effect SexFemale is the age-8 female mean distance. Those are the easiest to see (I always start with the easy ones), but the rest aren't too bad to figure out. The mean distance for age-10 males is the first coefficient plus the third coefficient (fage10). The mean distance for age-10 females is the sum of the coefficients SexFemale, fage10, and SexFemale:fage10. The rest follow along the same lines.
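As a quick check (a sketch; fixef extracts the fixed effects from the lme fit):

# Rebuild two of the group means from the coefficients and compare
# them with the ddply table above
b <- fixef(fit1)
b["SexMale"] + b["fage10"]                            # age-10 male mean, ~23.77
b["SexFemale"] + b["fage10"] + b["SexFemale:fage10"]  # age-10 female mean, ~22.23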
Once you know how to create linear combinations of coefficients for the group means, you can use estimable to calculate any comparisons of interest. Of course there are a bunch of caveats here about main effects, statistical evidence of an interaction, theoretical reasons to leave interactions in, etc. That is for you to decide. But if I were leaving the interaction in the model (note there is no statistical evidence of an interaction; see anova(fit1)) and yet wanted to compare the overall mean of Male to Female, I would write out the following linear combinations of coefficients:
# male/age group means
male8 = c(1, 0, 0, 0, 0, 0, 0, 0)
male10 = c(1, 0, 1, 0, 0, 0, 0, 0)
male12 = c(1, 0, 0, 1, 0, 0, 0, 0)
male14 = c(1, 0, 0, 0, 1, 0, 0, 0)
# female/age group means
female8 = c(0, 1, 0, 0, 0, 0, 0, 0)
female10 = c(0, 1, 1, 0, 0, 1, 0, 0)
female12 = c(0, 1, 0, 1, 0, 0, 1, 0)
female14 = c(0, 1, 0, 0, 1, 0, 0, 1)
# overall male group mean
male = (male8 + male10 + male12 + male14)/4
# overall female group mean
female = (female8 + female10 + female12 + female14)/4
require(gmodels)
estimable(fit1, rbind(male - female))
You can check your overall group means to make sure you made your linear combinations of coefficients correctly.
ddply(Orthodont2, .(Sex), summarise, dist = mean(distance) )
Sex dist
1 Male 25.20455
2 Female 22.64773
Answer 3:
The usual way to test whether the interaction is significant is to do a likelihood ratio test (e.g., see the discussion on R-Sig-ME). To do that, you have to also estimate a model without the interaction, and you'll have to fit both models with method = "ML":
# Full model, with the interaction
f0 = lme(response ~ 0 + factor(condition) * factor(items),
         random = ~1|subject, method = "ML")
# Reduced model, main effects only
f1 = lme(response ~ 0 + factor(condition) + factor(items),
         random = ~1|subject, method = "ML")
You can then compare using anova
:
anova(f0, f1)
Also see this blog post.
Source: https://stackoverflow.com/questions/17794729/test-for-significance-of-interaction-in-linear-mixed-models-in-nlme-in-r