I am confused with the way predict.glm function in R works. According to the help,
The "terms" option returns a matrix giving the
I have already edited your question to include the "correct" way of getting the (raw) model matrix, the model coefficients, and your intended term-wise prediction, so your other questions on how to get these are already solved. In the following, I shall help you understand predict.glm().
predict.glm() (actually, predict.lm()) applies centring constraints to each model term when doing term-wise prediction.
Initially, you have a model matrix
X <- model.matrix(y~(x==1)+(x==2), data = test.data)
but it is then centred by subtracting the column means:
avx <- colMeans(X)
X1 <- sweep(X, 2L, avx)
> avx
(Intercept)  x == 1TRUE  x == 2TRUE
  1.0000000   0.2222222   0.3333333
> X1
  (Intercept) x == 1TRUE x == 2TRUE
1           0  0.7777778 -0.3333333
2           0 -0.2222222  0.6666667
3           0 -0.2222222 -0.3333333
4           0  0.7777778 -0.3333333
5           0 -0.2222222  0.6666667
6           0 -0.2222222  0.6666667
7           0 -0.2222222 -0.3333333
8           0 -0.2222222 -0.3333333
9           0 -0.2222222 -0.3333333
Then term-wise computation is done using this centred model matrix:
t(beta*t(X1))
  (Intercept) x == 1TRUE x == 2TRUE
1           0 -0.8544762  0.1351550
2           0  0.2441361 -0.2703101
3           0  0.2441361  0.1351550
4           0 -0.8544762  0.1351550
5           0  0.2441361 -0.2703101
6           0  0.2441361 -0.2703101
7           0  0.2441361  0.1351550
8           0  0.2441361  0.1351550
9           0  0.2441361  0.1351550
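If the double transpose in t(beta * t(X1)) looks odd: R recycles a vector down the columns of a matrix, so transposing first lets each coefficient multiply its own column. Here is a minimal sketch on a toy matrix (Xc and b are made-up values, not taken from your model) showing that it agrees with two more explicit spellings:

```r
# Toy 3 x 2 "centred" matrix and coefficient vector (hypothetical values)
Xc <- matrix(c(1, 2, 3, 4, 5, 6), nrow = 3)
b  <- c(10, 100)

# Scale column j of Xc by b[j], written three equivalent ways
by_transpose <- t(b * t(Xc))          # recycle b down the columns of t(Xc)
by_sweep     <- sweep(Xc, 2L, b, `*`) # explicit column-wise multiplication
by_diag      <- Xc %*% diag(b)        # matrix product with a diagonal matrix

# Each comparison should report TRUE
all.equal(by_transpose, by_sweep)
all.equal(by_transpose, by_diag)
```

sweep() is probably the most readable of the three, since it states the margin explicitly.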
After centring, each term is vertically shifted to have zero mean. As a result, the intercept becomes 0. Not to worry: a new intercept is computed by aggregating the shifts of all model terms:
intercept <- as.numeric(crossprod(avx, beta))
# [1] 0.7193212
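To see why this works, it may help to write the decomposition out. Using notation introduced here only for illustration ($x_{ij}$ for an entry of X, $\bar{x}_j$ for the column means in avx, $\beta_j$ for the coefficients in beta), the linear predictor for observation $i$ splits as

$$
\eta_i \;=\; \sum_j x_{ij}\,\beta_j \;=\; \sum_j \left(x_{ij} - \bar{x}_j\right)\beta_j \;+\; \sum_j \bar{x}_j\,\beta_j .
$$

The first sum adds up the centred term-wise contributions (the intercept column contributes 0, because it is constant and equals its own mean). The second sum does not depend on $i$; it is exactly crossprod(avx, beta), the new intercept.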
Now you should see what predict.glm(, type = "terms") gives you.
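If you want to check this yourself, here is a sketch on made-up data. The data frame dat, its response y and the binomial family are all assumptions for illustration (they are not your test.data); only the pattern of x is chosen to match the column means shown above. It rebuilds the term-wise prediction by hand and compares it with what predict() returns, including the "constant" attribute that holds the new intercept:

```r
# Hypothetical data: x has two 1s and three 2s out of nine rows; y is made up
dat <- data.frame(
  x = c(1, 2, 3, 1, 2, 2, 3, 3, 3),
  y = c(1, 1, 1, 0, 1, 0, 1, 1, 0)
)
fit  <- glm(y ~ (x == 1) + (x == 2), family = binomial, data = dat)
beta <- coef(fit)

# Manual computation: centre the model matrix, then scale each column
X   <- model.matrix(fit)
avx <- colMeans(X)
X1  <- sweep(X, 2L, avx)
manual_terms <- sweep(X1, 2L, beta, `*`)[, -1, drop = FALSE]  # drop intercept column
manual_const <- sum(avx * beta)

# What predict() reports
pt <- predict(fit, type = "terms")

# Each comparison should report TRUE
all.equal(unname(manual_terms), unname(pt), check.attributes = FALSE)
all.equal(manual_const, attr(pt, "constant"))
all.equal(unname(rowSums(pt) + attr(pt, "constant")),
          unname(predict(fit, type = "link")))
```

The last comparison shows that nothing is lost by centring: adding the constant back to the row sums of the term matrix recovers the ordinary prediction on the link scale.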