R and factor coding in formula

廉价感情. Submitted on 2019-12-25 08:19:52

Question


How do I use the formula interface if I want dummies with custom values, e.g. values 1 and 2 rather than 0 and 1? The estimation might look like the following, where supp is a factor variable.

fit <- lm(len ~ dose + supp, data = ToothGrowth)

In this example the different values are of little use, but in many cases a "re-written" model with custom values can be useful.

EDIT: My actual factor has e.g. 3 levels, and I want the two dummy columns coded differently: one as a 1/0 variable and the other as a 1/2 variable. The example above has only two levels.


Answer 1:


You can set the contrasts to whatever you want: create the matrix you want to use, then either pass it to the contrasts argument of lm or set it as the contrasts of the factor itself.

Some sample data:

set.seed(6)
d <- data.frame(g=gl(3,5,labels=letters[1:3]), x=round(rnorm(15,50,20)))

The contrasts you have in mind:

mycontrasts <- matrix(c(0,0,1,0,1,1), byrow=TRUE, nrow=3)
colnames(mycontrasts) <- c("12","23")
mycontrasts
#     12 23
#[1,]  0  0
#[2,]  1  0
#[3,]  1  1

Then you use this in the lm call:

> lm(x ~ g, data=d, contrasts=list(g=mycontrasts))

Call:
lm(formula = x ~ g, data = d, contrasts = list(g = mycontrasts))

Coefficients:
(Intercept)          g12          g23  
       58.8        -13.6          5.8  

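We can check that it does the right thing by comparing the group means. With this coding, g12 is the difference between the means of levels b and a, and g23 is the difference between the means of levels c and b: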

> diff(tapply(d$x, d$g, mean))
    b     c 
-13.6   5.8 
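
To see why the coefficients come out as differences between adjacent levels, it can help to look at the design matrix that R builds from the contrast matrix. The following is only a minimal sketch: it rebuilds the same sample data and mycontrasts as above, and the names X, fit and means are introduced here purely for illustration.

```r
set.seed(6)
d <- data.frame(g = gl(3, 5, labels = letters[1:3]),
                x = round(rnorm(15, 50, 20)))

mycontrasts <- matrix(c(0, 0,
                        1, 0,
                        1, 1), byrow = TRUE, nrow = 3)
colnames(mycontrasts) <- c("12", "23")

# Design matrix: every observation gets the row of mycontrasts
# that belongs to its level, next to the intercept column
X <- model.matrix(~ g, data = d, contrasts.arg = list(g = mycontrasts))
unique(X)   # one distinct row per level: (1,0,0), (1,1,0), (1,1,1)

fit   <- lm(x ~ g, data = d, contrasts = list(g = mycontrasts))
means <- tapply(d$x, d$g, mean)

# Intercept = mean of level a; the other two = adjacent differences of means
all.equal(unname(coef(fit)), unname(c(means["a"], diff(means))))
```

Level c has a 1 in both columns, so its fitted mean is the intercept plus both coefficients, which is why g23 measures c against b rather than against a.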

The default for an unordered factor is treatment contrasts, which use the first level as the baseline:

> lm(x ~ g, data=d)

Call:
lm(formula = x ~ g, data = d)

Coefficients:
(Intercept)           gb           gc  
       58.8        -13.6         -7.8  

But that default can be changed by assigning the matrix to the factor with contrasts, after which lm needs no contrasts argument:

> contrasts(d$g) <- mycontrasts
> lm(x ~ g, data=d)

Call:
lm(formula = x ~ g, data = d)

Coefficients:
(Intercept)          g12          g23  
       58.8        -13.6          5.8  
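
The same mechanism should also cover the 1/2 coding asked about in the question. Below is a hedged sketch on the two-level supp factor from ToothGrowth; the names tg, c12, fit01 and fit12 are made up for this illustration, and the one-column matrix is just one possible way to express a 1/2 dummy.

```r
tg <- ToothGrowth   # built-in dataset; supp has the levels OJ and VC

# One-column coding matrix: OJ is coded 1, VC is coded 2
c12 <- matrix(c(1, 2), ncol = 1,
              dimnames = list(levels(tg$supp), "12"))

fit01 <- lm(len ~ dose + supp, data = tg)            # default 0/1 dummy
fit12 <- lm(len ~ dose + supp, data = tg,
            contrasts = list(supp = c12))            # 1/2 dummy

coef(fit01)   # coefficient for supp is named suppVC
coef(fit12)   # coefficient for supp is named supp12

# Same fit, different parametrisation: the supp slope is unchanged and
# the intercept is shifted down by exactly that slope
all.equal(fitted(fit01), fitted(fit12))
all.equal(unname(coef(fit12)["(Intercept)"]),
          unname(coef(fit01)["(Intercept)"] - coef(fit01)["suppVC"]))
```

For a factor with three levels the idea is the same: any 3 x 2 matrix can be used as long as its two columns, together with the intercept column, are linearly independent, so one column can be 0/1 and the other 1/2.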


Source: https://stackoverflow.com/questions/9616742/r-and-factor-coding-in-formula
