I have a dummy variable black where black==0 is White and black==1 is Black. I am trying to fit a linear model lm for the black==1 category only, however running the code below gives me the incorrect coefficients. Is there a way in R to run a model with the if statement, similar to Stata?
library(foreign)
df<-read.dta("hw4.dta")
attach(df)
black[black==0]<-NA
model3<-lm(rent~I(income^2)+income+black)
It looks like there are a few issues here. First, you've stored all your data in separate vectors rent, income and black. You should instead store it in a data frame:
data <- data.frame(rent, income, black)
To limit a data frame based on a logical expression, you can use the subset function:
data.limited <- subset(data, black == 1)
Finally, you can run your analysis on your limited data frame (presumably without the black variable):
model3 <- lm(rent~I(income^2)+income, data=data.limited)
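For anyone who wants to try this without hw4.dta, here is a minimal sketch of the same steps on simulated data. The data frame sim, the coefficients used to generate rent, and the names sim.limited and fit are made up for illustration and are not taken from the asker's file.

```r
# Simulated stand-in for hw4.dta (hypothetical data, not the asker's file)
set.seed(1)
n <- 200
sim <- data.frame(
  income = runif(n, 10, 100),
  black  = rbinom(n, 1, 0.4)
)
sim$rent <- 300 + 5 * sim$income - 0.02 * sim$income^2 + rnorm(n, sd = 20)

# Keep only the rows where black == 1
sim.limited <- subset(sim, black == 1)

# Fit the model on the limited data; black is constant there, so it is left out
fit <- lm(rent ~ I(income^2) + income, data = sim.limited)

# The fit should use exactly as many observations as there are black == 1 rows
nobs(fit) == sum(sim$black == 1)
coef(fit)
```

The nobs() check is a quick way to confirm that the model really was fitted on the intended subgroup only.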
Why not subset the data before running the model? I personally prefer using a data frame rather than separate vectors, which makes the subsetting easier.
df <- data.frame(rent, income, black)
Then subset the data frame, or create another one:
df <- df[df$black==1,]
And run the model
model3 <- lm(rent ~ I(income^2) + income, data=df)
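One thing to watch with bracket indexing: if black contains missing values, a logical condition that evaluates to NA produces an all-NA row, whereas subset() and which() drop such rows. A small sketch with a made-up toy data frame (the values are hypothetical):

```r
# Toy data with a missing value in black (hypothetical values)
toy <- data.frame(
  rent   = c(500, 620, 710, 450, 580),
  income = c(30, 45, 60, 25, 40),
  black  = c(1, 0, NA, 1, 1)
)

by_bracket <- toy[toy$black == 1, ]         # the NA condition yields an all-NA row
by_which   <- toy[which(toy$black == 1), ]  # which() keeps only the TRUE positions
by_subset  <- subset(toy, black == 1)       # subset() treats NA as FALSE

nrow(by_bracket)  # 4
nrow(by_which)    # 3
nrow(by_subset)   # 3
```

With the default na.action, lm() would drop the all-NA row anyway, so the fitted coefficients should not change, but the extra row can be confusing when you inspect the subset.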
The code written below should do it.
model3 <- lm(rent~I(income^2)+income, data=df, subset=black==1)
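As a rough check that the subset argument of lm() is equivalent to subsetting the data frame first, here is a sketch on simulated data. The data frame sim and the names fit_arg and fit_pre are hypothetical, and the black term is left out of the formula because it is constant within the subset.

```r
# Simulated data (hypothetical, for illustration only)
set.seed(2)
n <- 150
sim <- data.frame(income = runif(n, 10, 100), black = rbinom(n, 1, 0.5))
sim$rent <- 250 + 4 * sim$income + rnorm(n, sd = 15)

# Option 1: let lm() do the filtering; the expression is evaluated inside data
fit_arg <- lm(rent ~ I(income^2) + income, data = sim, subset = black == 1)

# Option 2: filter the data frame first, then fit
fit_pre <- lm(rent ~ I(income^2) + income, data = subset(sim, black == 1))

# Both fits should give the same coefficients
all.equal(coef(fit_arg), coef(fit_pre))
```

This is the closest R analogue to Stata's if qualifier, since the original data frame stays untouched.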
Source: https://stackoverflow.com/questions/22103339/linear-regression-in-r-with-if-statement