How to change contrasts to compare with mean of all levels rather than reference level (R, lmer)?

为君一笑 submitted on 2020-12-13 04:48:10

Question


I have a dataset for which each row is one visit to a store by a salesperson and the fields include "outlet" (store ID), "devices" (how many electronic devices the salesperson sold) and "weekday" (the day of the week on which the salesperson was in the store).

I want to work out whether one weekday is better than the others for sales, so instead of comparing all the days of the week to a reference day such as Monday, I want to compare each of them to the mean of all the days of the week. I am using lmerTest::lmer (lme4::lmer with estimated p-values) for this.

I have tried the following code:

data$weekday <- factor(weekday_sales$weekday, levels=c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"))

contrasts(data$weekday) = contr.sum(7) 

summary(lmerTest::lmer(data=data, devices~weekday + (1|outlet)))

which gives:

Fixed effects:
            Estimate Std. Error       df t value Pr(>|t|)    
(Intercept)   4.3681     0.6024  12.4472   7.251 8.24e-06 ***
weekday1     -1.0585     0.5129 145.7337  -2.064  0.04080 *  
weekday2     -0.2830     0.4958 142.3214  -0.571  0.56913    
weekday3      1.1884     0.4907 140.5545   2.422  0.01671 *  
weekday4      0.1100     0.5025 145.1407   0.219  0.82707    
weekday5      1.3589     0.5135 143.8204   2.646  0.00904 ** 
weekday6     -0.1629     0.5020 143.1605  -0.325  0.74600   

However, all seven weekdays are present in the dataset, yet one is missing from the output. Also, the levels of weekday are stored in the dataset as "Monday", "Tuesday", "Wednesday" etc., not as "weekday1", "weekday2" etc.

Why is there one weekday missing and how do I know which one this is? Does this compare each weekday to the mean or is it doing something else? (And if so how do I change the contrasts to compare all levels to the mean of all levels?)


Answer 1:


The problem is that with sum contrasts you can't get a separate coefficient for every group versus the overall mean, because the seven deviations aren't independent: they sum to zero. If you know the grand mean G and the means of days 1-6, then the mean of day 7 can be calculated from the values you already have. So basically, you can't get all seven comparisons directly from the contrasts; for the seventh you'd need a post-hoc test of some kind.

With the standard treatment contrasts, you still only make six comparisons (1-2, 1-3, 1-4, 1-5, 1-6, 1-7), and the usual question is: hey, where did 1 go? The answer there is that it is the intercept. Here, you have G-1, G-2, G-3, G-4, G-5, G-6 and then lose G-7.
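The seventh comparison can still be recovered from the fit: under sum-to-zero coding the seven deviations add up to zero, so the last one is minus the sum of the six reported coefficients, and its standard error follows from the coefficient covariance matrix. A minimal sketch with simulated data and plain lm() (the data frame `dd`, the columns `w` and `y`, and the names `k`, `sunday_dev` and `sunday_se` are made up for illustration; applying the same arithmetic to fixef() and vcov() of an lmer fit is an assumption, not something shown in the question):

```r
set.seed(1)
days <- c("Monday", "Tuesday", "Wednesday", "Thursday",
          "Friday", "Saturday", "Sunday")
# Simulated, balanced data: 10 observations per weekday (illustrative only)
dd <- data.frame(w = factor(rep(days, 10), levels = days), y = rnorm(70))
fit <- lm(y ~ w, data = dd, contrasts = list(w = contr.sum))

b <- coef(fit)           # intercept plus six deviations (Monday..Saturday)
V <- vcov(fit)           # covariance matrix of those seven coefficients
k <- c(0, rep(-1, 6))    # weights that give "Sunday minus mean of the day means"

sunday_dev <- sum(k * b)                      # estimate of the missing deviation
sunday_se  <- sqrt(drop(t(k) %*% V %*% k))    # its standard error
c(estimate = sunday_dev, se = sunday_se)
```

Dividing the estimate by its standard error gives a t statistic for "Sunday vs. mean" comparable to the six rows in the summary table.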




Answer 2:


You need to explicitly suppress the intercept:

devices ~ -1 + weekday + (1|outlet)

or

devices ~ 0 + weekday + (1|outlet)

It's not obvious from the output, but when you use sum-to-zero contrasts, the first parameter is (level 1 - mean), the second is (level 2 - mean), etc., so the comparison that's missing is the one for the last level: "Sunday vs. mean".

set.seed(101)
w <- c("Monday", "Tuesday", "Wednesday", "Thursday", 
       "Friday", "Saturday", "Sunday")
# simulated data: 10 observations per weekday
dd <- data.frame(w=factor(rep(w,10),levels=w),y=rnorm(70))
# with intercept, sum-to-zero contrasts
m0 <- lm(y~w,dd, contrasts=list(w=contr.sum))
# intercept suppressed
m1 <- lm(y~w-1,dd, contrasts=list(w=contr.sum))
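To see what the two fits estimate, it helps to compare their coefficients with the raw day means. In this sketch (same simulated data; `day_means`, `cm` and `m0_named` are names introduced only for illustration), m0 reports the mean of the day means followed by six deviations from it, whereas m1 reports the seven day means themselves, because R uses full indicator coding for a factor once the intercept is dropped. The p-values in m1 therefore test each day's mean against zero, not against the overall mean. Giving the contrast matrix column names also replaces the w1, w2, ... labels with day names, which answers the "which weekday is which" part of the question.

```r
set.seed(101)
w <- c("Monday", "Tuesday", "Wednesday", "Thursday",
       "Friday", "Saturday", "Sunday")
dd <- data.frame(w = factor(rep(w, 10), levels = w), y = rnorm(70))
m0 <- lm(y ~ w, dd, contrasts = list(w = contr.sum))
m1 <- lm(y ~ w - 1, dd, contrasts = list(w = contr.sum))

day_means <- as.numeric(tapply(dd$y, dd$w, mean))

# m0: intercept = mean of the day means; w1..w6 = day mean minus that mean
m0_expected <- c(mean(day_means), (day_means - mean(day_means))[1:6])
all.equal(unname(coef(m0)), m0_expected)

# m1: the seven coefficients are the day means themselves
all.equal(unname(coef(m1)), day_means)

# Name the contrast columns so the coefficients carry day names, not w1..w6
cm <- contr.sum(7)
colnames(cm) <- w[-7]        # the last level (Sunday) has no column of its own
contrasts(dd$w) <- cm
m0_named <- lm(y ~ w, dd)
names(coef(m0_named))
```

The same column-naming step should work for the `weekday` factor in the original data before calling lmer, since the coefficient labels are built from the contrast matrix's column names.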


Source: https://stackoverflow.com/questions/59250992/how-to-change-contrasts-to-compare-with-mean-of-all-levels-rather-than-reference
