Stepwise regression error in R

悲哀的现实 2021-01-24 07:11

I want to run a stepwise regression in R to choose the best-fitting model. My code is attached here:

full.modelfixed <- glm(died_ed ~ age_1 + gender + race + insu         


        
2 Answers
  • 2021-01-24 07:44

    This probably belongs on a statistics forum (stats.stackexchange), but briefly, there are a number of considerations.

    The main one is that when comparing two models, they need to be fitted on the same dataset (i.e., the same rows), and you need to be able to nest the models within each other.

    For example:

    glm1 <- glm(Dependent~indep1+indep2+indep3, family = binomial, data = data)
    glm2 <- glm(Dependent~indep1+indep2, family = binomial, data = data)
    

    Now imagine that some values of indep3 are missing, but none of indep1 or indep2. When we run glm1, it is fitted on a smaller dataset: only the rows that have the dependent variable and all three independent variables (i.e., any rows where indep3 is missing are excluded).

    When we run glm2, the rows missing a value for indep3 are included, because those rows do contain Dependent, indep1 and indep2, which are the variables in the model.

    We can no longer directly compare models as they are fitted on different datasets.
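
    A minimal sketch of the row-count mismatch described above. The data frame is simulated and every name in it is hypothetical, since the asker's data is not shown; nobs() reports how many rows each fit actually used.

    ```r
    # Simulated stand-in data; names and sizes are assumptions for illustration
    set.seed(1)
    n <- 200
    data <- data.frame(
      indep1 = rnorm(n),
      indep2 = rnorm(n),
      indep3 = rnorm(n)
    )
    data$Dependent <- rbinom(n, 1, plogis(0.5 * data$indep1 - 0.3 * data$indep2))

    # Knock out indep3 in 30 distinct rows
    data$indep3[sample(n, 30)] <- NA

    glm1 <- glm(Dependent ~ indep1 + indep2 + indep3, family = binomial, data = data)
    glm2 <- glm(Dependent ~ indep1 + indep2, family = binomial, data = data)

    # glm() silently drops incomplete rows (default na.action = na.omit)
    nobs(glm1)  # 170: rows with indep3 missing are excluded
    nobs(glm2)  # 200: indep3 is not in the model, so no rows are dropped
    ```

    Because the two fits use different numbers of rows, their AIC or deviance values are not directly comparable.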

    Broadly, I think you can either 1) limit the data to complete cases, or 2) if appropriate, consider multiple imputation.
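
    A sketch of the first option, using simulated data with hypothetical variable names. When rows drop out between candidate models, step() typically stops with an error along the lines of "number of rows in use has changed: remove missing values?". Restricting the data to complete cases on every variable in the full model up front avoids that.

    ```r
    # Simulated stand-in data; names are assumptions for illustration
    set.seed(1)
    n <- 200
    data <- data.frame(indep1 = rnorm(n), indep2 = rnorm(n), indep3 = rnorm(n))
    data$Dependent <- rbinom(n, 1, plogis(0.5 * data$indep1))
    data$indep3[sample(n, 30)] <- NA

    # Keep only rows that are complete for every variable in the full model
    vars <- c("Dependent", "indep1", "indep2", "indep3")
    complete_data <- data[complete.cases(data[, vars]), ]

    full_model <- glm(Dependent ~ indep1 + indep2 + indep3,
                      family = binomial, data = complete_data)

    # Every candidate model is now fitted to the same rows
    step_model <- step(full_model, direction = "both", trace = 0)
    formula(step_model)
    ```

    Note that dropping incomplete rows is only reasonable if the missingness is unrelated to the outcome; otherwise the second option may be the better choice.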

    Hope that helps.

  • 2021-01-24 07:47

    You can use the mice package to impute the missing values; the models are then fitted on a dataset with no missing rows, which should avoid this kind of error.
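
    A rough sketch of what that could look like, assuming the mice package is installed. The data is simulated and the variable names are hypothetical. Running step() on a single completed dataset, as in the last lines, ignores the uncertainty from imputation, so treat it as an illustration and not a recommended workflow.

    ```r
    library(mice)  # assumes the mice package is installed

    # Simulated stand-in data; names are assumptions for illustration
    set.seed(1)
    n <- 200
    data <- data.frame(indep1 = rnorm(n), indep2 = rnorm(n), indep3 = rnorm(n))
    data$Dependent <- rbinom(n, 1, plogis(0.5 * data$indep1))
    data$indep3[sample(n, 30)] <- NA

    # Create 5 imputed datasets (default method for numeric columns)
    imp <- mice(data, m = 5, seed = 1, printFlag = FALSE)

    # Fit the full model on each imputed dataset and pool the estimates
    fits <- with(imp, glm(Dependent ~ indep1 + indep2 + indep3, family = binomial))
    pooled <- pool(fits)
    summary(pooled)

    # Extract the first completed dataset; it has no missing values,
    # so step() sees the same rows for every candidate model
    completed1 <- mice::complete(imp, 1)
    full_model <- glm(Dependent ~ indep1 + indep2 + indep3,
                      family = binomial, data = completed1)
    step_model <- step(full_model, trace = 0)
    ```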
