Logistic regression - defining reference level in R

Submitted by 隐身守侯 on 2019-12-30 00:36:51

Question


I am going nuts trying to figure this out. How can I define, in R, the reference level to use in a binary logistic regression? What about a multinomial logistic regression? Right now my code is:

logistic.train.model3 <- glm(class ~ x + y + z,
                             family = binomial(link = logit),
                             data = auth, na.action = na.exclude)

My response variable takes the values "YES" and "NO". I want to predict the probability of someone responding with "YES".

I DO NOT want to recode the variable to 0 / 1. Is there a way I can tell the model to predict "YES"?

Thank you for your help.


Answer 1:


Assuming you have class saved as a factor, use the relevel() function:

auth$class <- relevel(auth$class, ref = "YES")
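The question also asks about the multinomial case, which this answer does not cover. As a rough sketch only: assuming the model is fitted with multinom() from the nnet package (an assumption, since the original post names no multinomial function), the first factor level serves as the baseline, so relevel() can be used in the same way. The built-in iris data set stands in for the asker's data.

```r
library(nnet)  # assumption: the multinomial model is fitted with nnet::multinom()

# iris is only a stand-in for the asker's data; Species has three levels
dat <- iris
levels(dat$Species)  # "setosa" "versicolor" "virginica"; the first is the baseline

# Default fit: coefficients are reported for every level except the baseline
fit_default <- multinom(Species ~ Sepal.Length, data = dat, trace = FALSE)
rownames(coef(fit_default))

# Make "versicolor" the baseline instead
dat$Species <- relevel(dat$Species, ref = "versicolor")
fit_releveled <- multinom(Species ~ Sepal.Length, data = dat, trace = FALSE)
rownames(coef(fit_releveled))
```

Each row of the coefficient matrix describes the log-odds of that level relative to the baseline, so the level that disappears from the row names is the reference.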



Answer 2:


Note that, when using auth$class <- relevel(auth$class, ref = "YES"), you are actually predicting "NO".

To predict "YES", the reference level must be "NO". Therefore, you have to use auth$class <- relevel(auth$class, ref = "NO").
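A small simulation can make this concrete. The sketch below uses made-up data (the data frame reuses the name auth from the question, but its contents and the single predictor x are assumptions). It relies on the documented glm() behaviour that, for a factor response in a binomial model, the first level denotes failure and the other level success. Note also that factor() sorts levels alphabetically by default, so "NO" is usually already the reference level.

```r
# Simulated stand-in for the asker's data (hypothetical values)
set.seed(42)
n <- 500
x <- rnorm(n)
p_yes <- plogis(2 * x)  # "YES" becomes more likely as x grows
auth <- data.frame(
  x = x,
  class = factor(ifelse(runif(n) < p_yes, "YES", "NO"))
)

# Reference level "NO": the model describes the probability of "YES"
auth$class <- relevel(auth$class, ref = "NO")
fit_yes <- glm(class ~ x, family = binomial(link = "logit"), data = auth)

# Reference level "YES": the model describes the probability of "NO"
auth$class_flipped <- relevel(auth$class, ref = "YES")
fit_no <- glm(class_flipped ~ x, family = binomial(link = "logit"), data = auth)

coef(fit_yes)  # slope on x is positive: larger x, more likely "YES"
coef(fit_no)   # same magnitudes with opposite signs

# Fitted probabilities from the two models are complementary
range(fitted(fit_yes) + fitted(fit_no))
```

With "NO" as the reference, predict(fit_yes, type = "response") returns the estimated probability of "YES", which is what the question asks for.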

This is a common mistake, because most of the time the outcome variable is a vector of 0s and 1s and people want to predict 1.

But when such a vector is treated as a factor, the reference level is 0 (see below), so the model effectively predicts 1. Likewise, your reference level must be "NO" so that the model predicts "YES".

set.seed(1234)
x1 <- sample(c(0, 1), 50, replace = TRUE)
x2 <- factor(x1)
str(x2)
# Factor w/ 2 levels "0","1": 1 2 2 2 2 2 1 1 2 2 ...

You can see that the reference level is 0.
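One way to confirm this is to read the reference level directly with levels() and to compare a fit on the numeric 0/1 vector with a fit on its factor version. This is only a sketch: the predictor z is a hypothetical variable added purely for illustration.

```r
set.seed(1234)
x1 <- sample(c(0, 1), 50, replace = TRUE)
x2 <- factor(x1)

# The reference level is the first element of levels()
ref_level <- levels(x2)[1]
ref_level

# Hypothetical predictor, unrelated to the outcome, just to have a model to fit
z <- rnorm(50)

fit_num <- glm(x1 ~ z, family = binomial)  # numeric 0/1 response
fit_fac <- glm(x2 ~ z, family = binomial)  # factor response, reference level "0"

# Both fits model the probability of 1, so the coefficients agree
all.equal(coef(fit_num), coef(fit_fac))
```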


Source: https://stackoverflow.com/questions/23282048/logistic-regression-defining-reference-level-in-r
