How to remove a lower order parameter in a model when the higher order parameters remain?

后端 未结 2 646
余生分开走
余生分开走 2021-02-14 18:19

The problem: I cannot remove a lower order parameter (e.g., a main effects parameter) in a model as long as the higher order parameters (i.e., interactions) rem

相关标签:
2条回答
  • 2021-02-14 19:00

    Here's a sort of answer; there is no way that I know of to formulate this model directly by the formula ...

    Construct data as above:

    d <- data.frame(A = rep(c("a1", "a2"), each = 50),
                    B = c("b1", "b2"), value = rnorm(100))
    options(contrasts=c('contr.sum','contr.poly'))
    

    Confirm original finding that just subtracting the factor from the formula doesn't work:

    m1 <- lm(value ~ A * B, data = d)
    coef(m1)
    ## (Intercept)          A1          B1       A1:B1 
    ## -0.23766309  0.04651298 -0.13019317 -0.06421580 
    
    m2 <- update(m1, .~. - A)
    coef(m2)
    ## (Intercept)          B1      Bb1:A1      Bb2:A1 
    ## -0.23766309 -0.13019317 -0.01770282  0.11072877 
    

    Formulate the new model matrix:

    X0 <- model.matrix(m1)
    ## drop Intercept column *and* A from model matrix
    X1 <- X0[,!colnames(X0) %in% "A1"]
    

    lm.fit allows direct specification of the model matrix:

    m3 <- lm.fit(x=X1,y=d$value)
    coef(m3)
    ## (Intercept)          B1       A1:B1 
    ## -0.2376631  -0.1301932  -0.0642158 
    

    This method only works for a few special cases that allow the model matrix to be specified explicitly (e.g. lm.fit, glm.fit).

    More generally:

    ## need to drop intercept column (or use -1 in the formula)
    X1 <- X1[,!colnames(X1) %in% "(Intercept)"]
    ## : will confuse things -- substitute something inert
    colnames(X1) <- gsub(":","_int_",colnames(X1))
    newf <- reformulate(colnames(X1),response="value")
    m4 <- lm(newf,data=data.frame(value=d$value,X1))
    coef(m4)
    ## (Intercept)          B1   A1_int_B1 
    ##  -0.2376631  -0.1301932  -0.0642158 
    

    This approach has the disadvantage that it won't recognize multiple input variables as stemming from the same predictor (i.e., multiple factor levels from a more-than-2-level factor).

    0 讨论(0)
  • 2021-02-14 19:06

    I think the most straightforward solution is to use model.matrix. Possibly, you could achieve what you want with some fancy footwork and custom contrasts. However, if you want "type 3 esque" p-values, You probably want it for every term in your model, in which case, I think my approach with model.matrix is convenient anyway because you can easily implicitly loop through all models dropping one column at a time. The provision of a possible approach is not an endorsement of the statistical merits of it, but I do think you formulated a clear question and seem to know it may be unsound statistically so I see no reason not to answer it.

    ## initial data
    set.seed(10)
    d <- data.frame(
      A = rep(c("a1", "a2"), each = 50),
      B = c("b1", "b2"),
      value = rnorm(100))
    
    options(contrasts=c('contr.sum','contr.poly'))
    
    ## create design matrix
    X <- model.matrix(~ A * B, data = d)
    
    ## fit models dropping one effect at a time
    ## change from 1:ncol(X) to 2:ncol(X)
    ## to avoid a no intercept model
    m <- lapply(1:ncol(X), function(i) {
      lm(value ~ 0 + X[, -i], data = d)
    })
    ## fit (and store) the full model
    m$full <- lm(value ~ 0 + X, data = d)
    ## fit the full model in usual way to compare
    ## full and regular should be equivalent
    m$regular <- lm(value ~ A * B, data = d)
    ## extract and view coefficients
    lapply(m, coef)
    

    This results in this final output:

    [[1]]
       X[, -i]A1    X[, -i]B1 X[, -i]A1:B1 
      -0.2047465   -0.1330705    0.1133502 
    
    [[2]]
    X[, -i](Intercept)          X[, -i]B1       X[, -i]A1:B1 
            -0.1365489         -0.1330705          0.1133502 
    
    [[3]]
    X[, -i](Intercept)          X[, -i]A1       X[, -i]A1:B1 
            -0.1365489         -0.2047465          0.1133502 
    
    [[4]]
    X[, -i](Intercept)          X[, -i]A1          X[, -i]B1 
            -0.1365489         -0.2047465         -0.1330705 
    
    $full
    X(Intercept)          XA1          XB1       XA1:B1 
      -0.1365489   -0.2047465   -0.1330705    0.1133502 
    
    $regular
    (Intercept)          A1          B1       A1:B1 
     -0.1365489  -0.2047465  -0.1330705   0.1133502 
    

    That is nice so far for models using lm. You mentioned this is ultimately for lmer(), so here is an example using mixed models. I believe it may become more complex if you have more than a random intercept (i.e., effects need to be dropped from the fixed and random parts of the model).

    ## mixed example
    require(lme4)
    
    ## data is a bit trickier
    set.seed(10)
    mixed <- data.frame(
      ID = factor(ID <- rep(seq_along(n <- sample(3:8, 60, TRUE)), n)),
      A = sample(c("a1", "a2"), length(ID), TRUE),
      B = sample(c("b1", "b2"), length(ID), TRUE),
      value = rnorm(length(ID), 3) + rep(rnorm(length(n)), n))
    
    ## model matrix as before
    X <- model.matrix(~ A * B, data = mixed)
    
    ## as before but allowing a random intercept by ID
    ## becomes trickier if you need to add/drop random effects too
    ## and I do not show an example of this
    mm <- lapply(1:ncol(X), function(i) {
      lmer(value ~ 0 + X[, -i] + (1 | ID), data = mixed)
    })
    
    ## full model
    mm$full <- lmer(value ~ 0 + X + (1 | ID), data = mixed)
    ## full model regular way
    mm$regular <- lmer(value ~ A * B + (1 | ID), data = mixed)
    
    ## view all the fixed effects
    lapply(mm, fixef)
    

    Which gives us...

    [[1]]
       X[, -i]A1    X[, -i]B1 X[, -i]A1:B1 
     0.009202554  0.028834041  0.054651770 
    
    [[2]]
    X[, -i](Intercept)          X[, -i]B1       X[, -i]A1:B1 
            2.83379928         0.03007969         0.05992235 
    
    [[3]]
    X[, -i](Intercept)          X[, -i]A1       X[, -i]A1:B1 
            2.83317191         0.02058800         0.05862495 
    
    [[4]]
    X[, -i](Intercept)          X[, -i]A1          X[, -i]B1 
            2.83680235         0.01738798         0.02482256 
    
    $full
    X(Intercept)          XA1          XB1       XA1:B1 
      2.83440919   0.01947658   0.02928676   0.06057778 
    
    $regular
    (Intercept)          A1          B1       A1:B1 
     2.83440919  0.01947658  0.02928676  0.06057778 
    
    0 讨论(0)
提交回复
热议问题