Question
I am going nuts trying to figure this out. How can I define, in R, the reference level to use in a binary logistic regression? What about a multinomial logistic regression? Right now my code is:
logistic.train.model3 <- glm(class ~ x + y + z,
                             family = binomial(link = logit),
                             data = auth, na.action = na.exclude)
My response variable takes the values "YES" and "NO". I want to predict the probability of someone responding with "YES".
I DO NOT want to recode the variable to 0/1. Is there a way I can tell the model to predict "YES"?
Thank you for your help.
Answer 1:
Assuming you have class saved as a factor, use the relevel() function:
auth$class <- relevel(auth$class, ref = "YES")
Answer 2:
Note that when you use auth$class <- relevel(auth$class, ref = "YES"), you are actually predicting "NO".
To predict "YES", the reference level must be "NO". Therefore, you have to use auth$class <- relevel(auth$class, ref = "NO").
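One way to check this claim is to fit the same model under both reference levels on simulated data. The sketch below is illustrative only: the data frame `d`, the predictor `x`, and the simulation settings are made up for the example and are not the asker's `auth` data. For a factor response, `glm()` with `family = binomial` treats the first level as failure and the other level as success, so the fitted probabilities refer to the second level.

```r
set.seed(42)

# Hypothetical data: larger x makes "YES" more likely.
n <- 200
x <- rnorm(n)
p_yes <- plogis(-0.5 + 1.5 * x)
d <- data.frame(
  x = x,
  class = factor(ifelse(runif(n) < p_yes, "YES", "NO"))
)

# Reference level "NO": the model estimates P(class == "YES").
d$class <- relevel(d$class, ref = "NO")
fit_yes <- glm(class ~ x, family = binomial(link = "logit"), data = d)

# Reference level "YES": the model estimates P(class == "NO").
d2 <- d
d2$class <- relevel(d2$class, ref = "YES")
fit_no <- glm(class ~ x, family = binomial(link = "logit"), data = d2)

# The two sets of fitted probabilities should be complements of each other,
# and the coefficients should have opposite signs.
prob_yes <- predict(fit_yes, type = "response")
prob_no <- predict(fit_no, type = "response")
all.equal(unname(prob_yes + prob_no), rep(1, n))
coef(fit_yes)
coef(fit_no)
```

If the slope on `x` comes out positive in `fit_yes` and negative in `fit_no`, that matches the point above: the model describes whichever level is not the reference.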
This is a common mistake, because most of the time the outcome variable is a vector of 0 and 1, and people want to predict 1.
When such a vector is treated as a factor variable, the reference level is 0 (see below), so the model effectively predicts 1. Likewise, your reference level must be "NO" so that you predict "YES".
set.seed(1234)
x1 <- sample(c(0, 1), 50, replace = TRUE)
x2 <- factor(x1)
str(x2)
# Factor w/ 2 levels "0","1": 1 2 2 2 2 2 1 1 2 2 ...
You can see that the reference level is 0.
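The same kind of check can be applied to a "YES"/"NO" response. This is a minimal sketch using a made-up character vector `answers` (not from the original post). It sets the level order explicitly rather than relying on the default sorting done by `factor()`, then shows how `relevel()` changes which level comes first.

```r
# Hypothetical responses, for illustration only.
answers <- c("YES", "NO", "NO", "YES", "YES")

# Giving the levels explicitly avoids relying on locale-dependent sorting.
f <- factor(answers, levels = c("NO", "YES"))
levels(f)          # the first level listed is the reference level

# relevel() moves the chosen level to the front.
f_yes_ref <- relevel(f, ref = "YES")
levels(f_yes_ref)
```

Inspecting `levels()` before fitting is a cheap way to confirm which outcome the model will describe.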
Source: https://stackoverflow.com/questions/23282048/logistic-regression-defining-reference-level-in-r