Custom contrasts in R: contrast coefficient matrix or contrast matrix / coding scheme? And how to get there?

Submitted by 给你一囗甜甜゛ on 2019-12-04 10:29:48

Question


Custom contrasts are very widely used in analyses, e.g.: "Do DV values at level 1 and level 3 of this three-level factor differ significantly?"

Intuitively, this contrast is expressed in terms of cell means as:

c(1,0,-1)

One or more of these contrasts, bound as columns, form a contrast coefficient matrix, e.g.

mat = matrix(ncol = 2, byrow = TRUE, data = c(
    1,  0,
    0,  1,
   -1, -1)
)
     [,1] [,2]
[1,]    1    0
[2,]    0    1
[3,]   -1   -1
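
As a small illustration of what "expressed in terms of cell means" means here, each column of mat can be applied to a vector of cell means to get one comparison. The cell means below are made-up numbers, used only for this sketch:

```r
mat <- matrix(c( 1,  0,
                 0,  1,
                -1, -1), ncol = 2, byrow = TRUE)

# Hypothetical cell means for levels 1, 2 and 3 (illustrative values only)
cell_means <- c(10, 12, 15)

# Each column of mat weights the cell means:
# column 1 gives level 1 minus level 3, column 2 gives level 2 minus level 3
comparisons <- drop(t(mat) %*% cell_means)
print(comparisons)  # -5 -3
```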

However, when it comes to running the contrasts specified by the coefficient matrix, there is a lot of (apparently contradictory) information on the web and in books. My question is: which information is correct?

Claim 1: contrasts(factor) takes a coefficient matrix

In some examples, the user is shown that the intuitive contrast coefficient matrix can be used directly via the contrasts() or C() functions. So it's as simple as:

contrasts(myFactor) <- mat
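
One way to check what Claim 1 would make R estimate is to invert the coding matrix with the intercept column that lm() adds. This is a sketch using the mat from above (which happens to coincide with contr.sum(3)); the names codes and weights are mine:

```r
mat <- matrix(c( 1,  0,
                 0,  1,
                -1, -1), ncol = 2, byrow = TRUE)

# lm() puts an intercept column of 1s in front of the factor's coding columns
codes <- cbind(1, mat)

# Row i of the inverse holds the weights that coefficient i places on the cell means
weights <- solve(codes)
print(round(weights, 3))
```

Under these assumptions, the second coefficient weights the cell means by (2/3, -1/3, -1/3), i.e. level 1 minus the mean of all three levels, rather than the intended (1, 0, -1).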

Claim 2: Transform coefficients to create a coding scheme

Elsewhere (e.g. UCLA stats) we are told the coefficient matrix (or basis matrix) must be transformed into a contrast matrix before use. This involves taking the inverse of the transpose of the coefficient matrix: (mat')⁻¹, or, in R:

contrasts(myFactor) = solve(t(mat))

This method requires padding the matrix with an initial constant column for the intercept, so that it is square and can be inverted. To avoid this, some sites recommend using a generalized inverse function which can cope with non-square matrices, i.e., MASS::ginv()

contrasts(myFactor) = ginv(t(mat))
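
A sketch of Claim 2 on simulated data, to see which quantities the coefficients end up estimating. The data, the seed and the names dv, coding and fit are made up for illustration. To stay in base R, the pseudo-inverse is written out as mat (mat'mat)⁻¹, which for this full-rank mat should match what MASS::ginv(t(mat)) returns:

```r
set.seed(1)
# Simulated three-level factor and response (illustrative data only)
myFactor <- factor(rep(c("a", "b", "c"), each = 10))
dv <- rnorm(30, mean = c(10, 12, 15)[as.integer(myFactor)])

mat <- matrix(c( 1,  0,
                 0,  1,
                -1, -1), ncol = 2, byrow = TRUE)

# Pseudo-inverse of t(mat), written out in base R
coding <- mat %*% solve(crossprod(mat))
contrasts(myFactor) <- coding

fit <- lm(dv ~ myFactor)
cell_means <- tapply(dv, myFactor, mean)

# The two factor coefficients next to the intended differences of cell means
estimates <- unname(coef(fit)[2:3])
intended  <- unname(c(cell_means["a"] - cell_means["c"],
                      cell_means["b"] - cell_means["c"]))
print(rbind(estimates, intended))
```

With this coding the coefficients reproduce the differences that the columns of mat describe.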

Third option: premultiply by the transpose, take the inverse, and postmultiply by the transpose

Elsewhere again (e.g. a note from SPSS support), we learn the correct algebra is: (mat'mat)⁻¹ mat'

Implying to me that the correct way to create the contrasts matrix should be:

x = solve(t(mat)%*% mat)%*% t(mat)
           [,1]       [,2]       [,3]
[1,]  0.6666667 -0.3333333 -0.3333333
[2,] -0.3333333  0.6666667 -0.3333333

contrasts(myFactor) = x
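
For comparison, here is a sketch that evaluates the third expression on the 3 x 2 mat defined earlier and sets it next to the Claim 2 matrix (written out in base R instead of calling ginv). The names third and claim2 are mine:

```r
mat <- matrix(c( 1,  0,
                 0,  1,
                -1, -1), ncol = 2, byrow = TRUE)

# Third option: (mat'mat)^-1 mat'
third <- solve(t(mat) %*% mat) %*% t(mat)

# Claim 2 without MASS: pseudo-inverse of t(mat), i.e. mat (mat'mat)^-1
claim2 <- mat %*% solve(t(mat) %*% mat)

print(dim(third))   # 2 3
print(dim(claim2))  # 3 2

# The two matrices are transposes of each other
print(isTRUE(all.equal(third, t(claim2))))
```

Since a contrast matrix needs one row per factor level, it is the 3 x 2 orientation that fits a three-level factor; the 2 x 3 result would have to be transposed first.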

My question is: which is right (assuming I am interpreting and describing each piece of advice accurately)? How does one specify custom contrasts in R for lm, lme, etc.?


Answer 1:


Claim 2 is correct (see the answers here and here), and sometimes Claim 1 is too. This is because there are cases in which the generalized inverse of the (transposed) coefficient matrix is equal to the matrix itself.
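
One such case, as far as I can tell, is a coefficient matrix whose columns are orthonormal. The sketch below checks that, and also shows that merely orthogonal columns come back rescaled by their squared length. The matrices are chosen by me for illustration, and the pseudo-inverse is written out in base R as C (C'C)⁻¹:

```r
# Orthonormal contrast columns: unit length and mutually orthogonal
C_unit <- cbind(c(1, -1, 0) / sqrt(2), c(1, 1, -2) / sqrt(6))
pinv_unit <- C_unit %*% solve(crossprod(C_unit))   # pseudo-inverse of t(C_unit)
same_unit <- isTRUE(all.equal(pinv_unit, C_unit))

# Orthogonal but not unit-length columns: each column is divided by its squared length
C_raw <- cbind(c(1, -1, 0), c(1, 1, -2))
pinv_raw <- C_raw %*% solve(crossprod(C_raw))
rescaled <- sweep(C_raw, 2, colSums(C_raw^2), "/")
same_raw <- isTRUE(all.equal(pinv_raw, rescaled))

print(c(same_unit, same_raw))  # TRUE TRUE
```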




Answer 2:


For what it's worth....

If you have a factor with 3 levels (levels A, B, and C) and you want to test the following orthogonal contrasts: A vs B, and the avg. of A and B vs C, your contrast codes would be:

Cont1 <- c(1, -1, 0)
Cont2 <- c(.5, .5, -1)

If you do as directed on the UCLA site (transform coefficients to make a coding scheme), as such:

contrasts(Variable) <- solve(t(cbind(c(1, 1, 1), Cont1, Cont2)))[, 2:3]

then your test statistics and p-values are identical to those you get from creating two dummy variables, e.g.:

Dummy1 <- ifelse(Variable == "A", 1, ifelse(Variable == "B", -1, 0))
Dummy2 <- ifelse(Variable == "A", .5, ifelse(Variable == "B", .5, -1))

and entering them both into the regression equation instead of your factor (the coefficient estimates themselves differ only by a constant scale factor per contrast). This makes me inclined to think that this is the correct way.

PS I don't write the most elegant R code, but it gets the job done. Sorry, I'm sure there are easier ways to recode variables, but you get the gist.
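
The comparison described in this answer can be sketched on simulated data. Everything below (the data, the seed, the object names fit_coding and fit_dummy) is made up for illustration; only the contrast codes and the two ways of entering them come from the answer:

```r
set.seed(2)
# Simulated, deliberately unbalanced three-level factor (illustrative data only)
Variable <- factor(rep(c("A", "B", "C"), times = c(8, 10, 12)))
y <- rnorm(30, mean = c(5, 6, 8)[as.integer(Variable)])

Cont1 <- c(1, -1, 0)
Cont2 <- c(.5, .5, -1)

# Route 1: transform the coefficients into a coding scheme
contrasts(Variable) <- solve(t(cbind(c(1, 1, 1), Cont1, Cont2)))[, 2:3]
fit_coding <- lm(y ~ Variable)

# Route 2: enter the raw contrast codes as two numeric predictors
Dummy1 <- ifelse(Variable == "A", 1, ifelse(Variable == "B", -1, 0))
Dummy2 <- ifelse(Variable == "A", .5, ifelse(Variable == "B", .5, -1))
fit_dummy <- lm(y ~ Dummy1 + Dummy2)

t_coding <- unname(summary(fit_coding)$coefficients[2:3, "t value"])
t_dummy  <- unname(summary(fit_dummy)$coefficients[2:3, "t value"])
print(rbind(t_coding, t_dummy))

# Ratio of the estimates (coding scheme / raw codes) for each contrast
est_ratio <- unname(coef(fit_coding)[2:3] / coef(fit_dummy)[2:3])
print(est_ratio)
```

In this sketch the t values agree, while the estimates differ by the factors 2 and 1.5 (the squared lengths of Cont1 and Cont2). Only the coding-scheme route returns the contrast on its original scale, e.g. mean(A) minus mean(B) for the first coefficient.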




Answer 3:


I'm probably missing something, but in each of your three examples, you specify the contrast matrix in the same way, i.e.

## Note: the function name is the plural, contrasts
contrasts(myFactor) = x

The only thing that differs is the value of x.

Using the data from the UCLA website as an example

hsb2 = read.table('http://www.ats.ucla.edu/stat/data/hsb2.csv', header=TRUE, sep=",")

#creating the factor variable race.f
hsb2$race.f = factor(hsb2$race, labels=c("Hispanic", "Asian", "African-Am", "Caucasian"))

We can specify either the treatment version of the contrasts

contrasts(hsb2$race.f) = contr.treatment(4)
summary(lm(write ~ race.f, hsb2))

or the sum version

contrasts(hsb2$race.f) = contr.sum(4)
summary(lm(write ~ race.f, hsb2))

Alternatively, we can specify a bespoke contrast matrix.

See ?contr.sum for other standard contrasts.
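
A minimal sketch of a bespoke contrast matrix for a four-level factor, using simulated data so that it does not depend on the download above. The helper coef_to_coding, the factor grp, the response score and the three comparisons are all hypothetical names and choices of mine; the helper pads the coefficients with a constant 1/k column and inverts the transpose, along the lines of the UCLA recipe:

```r
# Hypothetical helper: turn contrast coefficients (one column per contrast)
# into a coding matrix that can be assigned with contrasts<-
coef_to_coding <- function(coefs) {
  k <- nrow(coefs)
  solve(t(cbind(1 / k, coefs)))[, -1, drop = FALSE]
}

set.seed(3)
# Simulated four-level factor and response (illustrative data only)
grp <- factor(rep(c("g1", "g2", "g3", "g4"), times = c(15, 20, 25, 20)))
score <- rnorm(80, mean = c(46, 58, 48, 54)[as.integer(grp)], sd = 9)

# Three comparisons, written in terms of cell means
coefs <- cbind(g1_vs_g2   = c(1, -1, 0, 0),
               g3_vs_g4   = c(0, 0, 1, -1),
               g12_vs_g34 = c(.5, .5, -.5, -.5))

contrasts(grp) <- coef_to_coding(coefs)
fit <- lm(score ~ grp)

# Coefficients next to the same comparisons computed from the cell means
cell_means <- tapply(score, grp, mean)
estimates <- unname(coef(fit)[-1])
intended  <- drop(t(coefs) %*% cell_means)
print(rbind(estimates, intended))
```

Under these assumptions each coefficient equals the corresponding comparison of cell means, and the intercept is the unweighted mean of the four cell means.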



Source: https://stackoverflow.com/questions/31818174/custom-contrasts-in-r-contrast-coefficient-matrix-or-contrast-matrix-coding-s
