Stepwise regression error in R

纵饮孤独 submitted on 2020-01-30 12:10:21

Question


I want to run a stepwise regression in R to choose the best-fitting model. My code is below:

library(MASS)  # provides stepAIC()

# Full logistic regression model with all candidate predictors
full.modelfixed <- glm(died_ed ~ age_1 + gender + race + insurance + injury + ais + blunt_pen +
               comorbid + iss + min_dist + pop_dens_new + age_mdn + male_pct +
               pop_wht_pct + pop_blk_pct + unemp_pct + pov_100x_npct +
               urban_pct, data = trauma, family = binomial(link = "logit"), na.action = na.exclude)
# Backward elimination by AIC
reduced.modelfixed <- stepAIC(full.modelfixed, direction = "backward")

I get the following error message:

Error in stepAIC(full.modelfixed, direction = "backward") :   
number of rows in use has changed: remove missing values?

Almost every variable in the data has some missing values, so I cannot simply drop every row that contains a missing value (data = na.omit(data)).

Any idea on how to fix this?

Thanks!!


Answer 1:


This should probably be in a stats forum (stats.stackexchange), but briefly, there are a number of considerations.

The main one is that when comparing two models, they need to be fitted on the same dataset (i.e. you need to be able to nest the models within each other).

For example:

glm1 <- glm(Dependent ~ indep1 + indep2 + indep3, family = binomial, data = data)
glm2 <- glm(Dependent ~ indep1 + indep2, family = binomial, data = data)

Now imagine that we are missing values of indep3 but not of indep1 or indep2. When we run glm1, we are running it on a smaller dataset: the rows for which we have the dependent variable and all three independent ones (i.e. any rows where indep3 is missing are excluded).

When we run glm2, the rows missing a value for indep3 are included, because those rows do contain Dependent, indep1 and indep2, which are the variables in the model.

We can no longer directly compare models as they are fitted on different datasets.
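To make the mismatch concrete, here is a minimal sketch on simulated data. The data frame `dat` and its values are made up for illustration; only the variable names follow the example above. It counts the rows each fit actually uses.

```r
# Simulated data (hypothetical, for illustration only)
set.seed(1)
n <- 200
dat <- data.frame(indep1 = rnorm(n), indep2 = rnorm(n), indep3 = rnorm(n))
dat$Dependent <- rbinom(n, 1, plogis(0.5 * dat$indep1 - 0.5 * dat$indep2))

# Make indep3 missing in the first 40 rows
dat$indep3[1:40] <- NA

# glm() silently drops rows with an NA in any variable used by the formula
glm1 <- glm(Dependent ~ indep1 + indep2 + indep3, family = binomial, data = dat)
glm2 <- glm(Dependent ~ indep1 + indep2, family = binomial, data = dat)

nobs(glm1)  # 160: rows with indep3 missing are dropped
nobs(glm2)  # 200: indep3 is not in the model, so every row is used
```

The change in row count between the two fits is what stepAIC reports as "number of rows in use has changed".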

I think broadly you can either 1) limit the analysis to rows that are complete, or 2) if appropriate, consider multiple imputation.
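A minimal sketch of the first option, again on simulated data (the data frame `dat` is hypothetical). It assumes that dropping incomplete rows is acceptable for the analysis. Only the variables that appear in the model formula need to be complete, so this usually discards fewer rows than calling na.omit() on the whole data frame.

```r
library(MASS)  # provides stepAIC()

# Simulated data (hypothetical, for illustration only)
set.seed(1)
n <- 200
dat <- data.frame(indep1 = rnorm(n), indep2 = rnorm(n), indep3 = rnorm(n),
                  unused = NA_real_)  # a column the model never uses
dat$Dependent <- rbinom(n, 1, plogis(0.5 * dat$indep1 - 0.5 * dat$indep2))
dat$indep3[1:40] <- NA

# Keep rows that are complete on the model variables only
full_formula <- Dependent ~ indep1 + indep2 + indep3
model_vars <- all.vars(full_formula)
dat_cc <- dat[complete.cases(dat[, model_vars]), ]

# Every candidate model is now fitted on the same rows
full <- glm(full_formula, family = binomial, data = dat_cc)
reduced <- stepAIC(full, direction = "backward", trace = FALSE)

nobs(full) == nobs(reduced)  # TRUE: same rows at every step
```

In the question's setting, the same idea would mean subsetting `trauma` on the variables in the glm formula before fitting, rather than on every column.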

Hope that helps.




Answer 2:


You can use the mice package to impute the missing values. Fitting the models on the imputed dataset will then not produce this error.
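A rough sketch of what this could look like, assuming the mice package is installed and that its default imputation methods suit the data (whether they do is a statistical judgment, not something the code can check). The data frame `dat` is simulated and hypothetical.

```r
library(mice)  # multiple imputation by chained equations
library(MASS)  # provides stepAIC()

# Simulated data with missing predictors (hypothetical, for illustration only)
set.seed(1)
n <- 200
dat <- data.frame(indep1 = rnorm(n), indep2 = rnorm(n), indep3 = rnorm(n))
dat$Dependent <- rbinom(n, 1, plogis(0.5 * dat$indep1 - 0.5 * dat$indep2))
dat$indep2[sample(n, 20)] <- NA
dat$indep3[sample(n, 40)] <- NA

# Create 5 imputed datasets using mice's default methods
imp <- mice(dat, m = 5, seed = 1, printFlag = FALSE)

# Take the first completed dataset; it has no missing values,
# so stepAIC sees the same rows at every step
completed <- mice::complete(imp, 1)
full <- glm(Dependent ~ indep1 + indep2 + indep3, family = binomial, data = completed)
reduced <- stepAIC(full, direction = "backward", trace = FALSE)

# For inference, fit the chosen model on all imputations and pool the results
fits <- with(imp, glm(Dependent ~ indep1 + indep2 + indep3, family = binomial))
pooled <- pool(fits)
```

Selecting variables on a single completed dataset ignores the uncertainty from imputation, so treat the selected model as exploratory and rely on the pooled fit across all imputations for estimates and standard errors.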



Source: https://stackoverflow.com/questions/46817564/stepwise-regression-error-in-r
