What does predict.glm(, type = "terms") actually do?

伪装坚强ぢ 2020-12-31 17:35

I am confused by the way the predict.glm function in R works. According to the help,

The "terms" option returns a matrix giving the

1 answer
  • 2020-12-31 18:10

    I have already edited your question to include the "correct" way of getting the (raw) model matrix, the model coefficients, and your intended term-wise prediction, so your other questions about how to obtain these are already answered. Below, I will help you understand predict.glm().


    predict.glm() (actually, predict.lm()) applies a centring constraint to each model term when doing term-wise prediction.

    Initially, you have a model matrix

    X <- model.matrix(y ~ (x == 1) + (x == 2), data = test.data)
    

    but it is then centred by subtracting the column means:

    avx <- colMeans(X)
    X1 <- sweep(X, 2L, avx)
    
    > avx
    (Intercept)  x == 1TRUE  x == 2TRUE 
      1.0000000   0.2222222   0.3333333 
    
    > X1
      (Intercept) x == 1TRUE x == 2TRUE
    1           0  0.7777778 -0.3333333
    2           0 -0.2222222  0.6666667
    3           0 -0.2222222 -0.3333333
    4           0  0.7777778 -0.3333333
    5           0 -0.2222222  0.6666667
    6           0 -0.2222222  0.6666667
    7           0 -0.2222222 -0.3333333
    8           0 -0.2222222 -0.3333333
    9           0 -0.2222222 -0.3333333
    

    Then term-wise computation is done using this centred model matrix:

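    # beta holds the model coefficients (see the edited question);
    # the double transpose makes R's recycling scale column j of X1 by beta[j]
    t(beta * t(X1))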
    
      (Intercept) x == 1TRUE x == 2TRUE
    1           0 -0.8544762  0.1351550
    2           0  0.2441361 -0.2703101
    3           0  0.2441361  0.1351550
    4           0 -0.8544762  0.1351550
    5           0  0.2441361 -0.2703101
    6           0  0.2441361 -0.2703101
    7           0  0.2441361  0.1351550
    8           0  0.2441361  0.1351550
    9           0  0.2441361  0.1351550
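
The `t(beta * t(X1))` idiom can look cryptic, so here is a small sketch of what it does. The matrix and coefficient values are made up for illustration and are not from the question. The idiom scales each column of the matrix by the matching coefficient, which is the same as `sweep()` with `"*"` or post-multiplying by a diagonal matrix.

```r
# Toy values, assumed purely for illustration
X1_toy   <- matrix(c(1, 2, 3, 4, 5, 6), nrow = 3)  # 3 rows, 2 columns
beta_toy <- c(10, 100)                              # one coefficient per column

# R recycles vectors down columns, so transposing first lines up
# beta_toy[j] with column j of X1_toy
a <- t(beta_toy * t(X1_toy))

# Equivalent, more explicit formulations
b <- sweep(X1_toy, 2L, beta_toy, "*")
d <- X1_toy %*% diag(beta_toy)

print(a)
print(isTRUE(all.equal(a, b)) && isTRUE(all.equal(a, d)))
```

Writing `beta_toy * X1_toy` without the transposes would recycle the coefficients along rows instead, which is not what term-wise prediction needs.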
    

    After centring, each term is vertically shifted to have zero mean. As a result, the intercept column becomes 0. Not to worry: a new intercept is computed by aggregating the shifts of all model terms:

    intercept <- as.numeric(crossprod(avx, beta))
    # [1] 0.7193212
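
Why the new intercept takes this form can be seen from a short derivation. The notation is introduced here for illustration and is not part of the original answer: $x_{ij}$ is entry $(i, j)$ of `X`, $\bar{x}_j$ is the column mean `avx[j]`, $\beta_j$ is `beta[j]`, and $\eta_i$ is the linear predictor for row $i$.

```latex
\begin{aligned}
\eta_i &= \sum_j x_{ij}\,\beta_j \\
       &= \sum_j \left(x_{ij} - \bar{x}_j\right)\beta_j \;+\; \sum_j \bar{x}_j\,\beta_j
\end{aligned}
```

The first sum is the row sum of the centred term-wise matrix, and the second sum is `crossprod(avx, beta)`. Because the intercept column is constant at 1, its centred entries are 0 and its coefficient is absorbed into the second sum.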
    

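    You should now see what predict.glm(, type = "terms") gives you: the centred term-wise contributions (without the intercept column), with the new intercept attached as the "constant" attribute.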
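
To check the steps above end to end, here is a minimal sketch. The data frame and the binomial family are assumptions: the original `test.data` and model call are not shown in this answer, so the values below were chosen only to be consistent with the column means and coefficients printed earlier.

```r
# Hypothetical data (an assumption, not taken from the answer)
test.data <- data.frame(
  y = c(0, 0, 0, 1, 1, 1, 1, 1, 1),
  x = c(1, 2, 3, 1, 2, 2, 3, 3, 3)
)

# Assumed model: logistic regression with two indicator terms
fit <- glm(y ~ (x == 1) + (x == 2), family = binomial, data = test.data)

# Manual term-wise prediction, following the steps in the answer
beta <- coef(fit)
X    <- model.matrix(fit)
avx  <- colMeans(X)
X1   <- sweep(X, 2L, avx)                                   # centre each column
manual_terms    <- sweep(X1, 2L, beta, "*")[, -1, drop = FALSE]  # drop intercept column
manual_constant <- sum(avx * beta)                          # new intercept

# Built-in result to compare against
builtin <- predict(fit, type = "terms")

terms_match <- isTRUE(all.equal(unname(manual_terms), unname(builtin),
                                check.attributes = FALSE))
constant_match <- isTRUE(all.equal(manual_constant, attr(builtin, "constant")))

# Row sums of the terms plus the constant should recover the linear predictor
link_match <- isTRUE(all.equal(
  unname(rowSums(builtin) + attr(builtin, "constant")),
  unname(predict(fit, type = "link"))
))

print(c(terms = terms_match, constant = constant_match, link = link_match))
```

The column names differ between the two results (`x == 1TRUE` in the model matrix versus the term label `x == 1` in the `predict()` output), which is why the comparison strips names first.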
