Question
I get this error depending on which variables I include and the sequence in which I specify them in the formula:
Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) :
contrasts can be applied only to factors with 2 or more levels
I've done a little research on this, and it looks like the error is usually caused by the variable in question not being a factor. In this case, however, the variable (is_women_owned) is a factor with 2 levels ("No", "Yes").
> levels(customer_accounts$is_women_owned)
[1] "No" "Yes"
No error:
f1 <- lm(combined_sales ~ is_women_owned, data=customer_accounts)
No error:
f2 <- lm(combined_sales ~ total_assets + market_value + total_empl + empl_growth + sic + city + revenue_growth + revenue + net_income + income_growth, data=customer_accounts)
Regressing on the above formula plus the factor variable "is_women_owned":
f3 <- lm(combined_sales ~ total_assets + market_value + total_empl + empl_growth + sic + city + revenue_growth + revenue + net_income + income_growth + is_women_owned, data=customer_accounts)
Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) :
contrasts can be applied only to factors with 2 or more levels
I get the same error when applying stepwise linear regression, as you would expect.
This seems like a bug. I would expect lm to return a model in which "is_women_owned" perhaps offers no additional explanatory value because it is highly correlated with the other variables, not to error out like this.
I verified that there is no missing data for this variable, too:
> which(is.na(customer_accounts$is_women_owned))
integer(0)
Also, there are two values present in the factor variable:
customer_accounts$is_women_owned[1:20]
[1] No No No No No No No No No No No No No No Yes No
[17] No No No No
Levels: No Yes
Answer 1:
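# stringsAsFactors = TRUE makes f a factor on R >= 4.0 as well, matching the str() output below
twofac <- data.frame(y = c(1, 2, 3, 4, 5, 1), x = c(2, 56, 3, 5, 2, 1), f = c("apple", "apple", "apple", "apple", "apple", "banana"), stringsAsFactors = TRUE)
# First five rows only: f still has 2 declared levels, but only "apple" occurs
onefac <- twofac[1:5, ]
lm(y ~ x + f, data = twofac)
lm(y ~ x + f, data = onefac)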
> str(onefac)
'data.frame': 5 obs. of 3 variables:
$ y: num 1 2 3 4 5
$ x: num 2 56 3 5 2
$ f: Factor w/ 2 levels "apple","banana": 1 1 1 1 1
> str(twofac)
'data.frame': 6 obs. of 3 variables:
$ y: num 1 2 3 4 5 1
$ x: num 2 56 3 5 2 1
$ f: Factor w/ 2 levels "apple","banana": 1 1 1 1 1 2
> lm(y~x+f,data=twofac)
Call:
lm(formula = y ~ x + f, data = twofac)
Coefficients:
(Intercept) x fbanana
3.30783 -0.02263 -2.28519
> lm(y~x+f,data=onefac)
Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) :
contrasts can be applied only to factors with 2 or more levels
If you run the above, you will see that twofac, a model with a 2-level factor in which both levels are present, fits with no problem. onefac, a model with the same 2-level factor but with only one level present in the data, gives the same error you got.
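The difference between declared levels and levels actually present can be inspected directly. The following is a minimal sketch that rebuilds the same toy data and asks model.frame() to drop unused levels, which is what lm() does internally before it sets up contrasts; the variable name mf is only illustrative.

```r
# Same toy data as above; stringsAsFactors = TRUE so that f is a factor
twofac <- data.frame(
  y = c(1, 2, 3, 4, 5, 1),
  x = c(2, 56, 3, 5, 2, 1),
  f = c("apple", "apple", "apple", "apple", "apple", "banana"),
  stringsAsFactors = TRUE
)
onefac <- twofac[1:5, ]

# Declared levels survive subsetting, so this still reports 2
nlevels(onefac$f)

# Build the model frame with unused levels dropped, as lm() does
mf <- model.frame(y ~ x + f, data = onefac, drop.unused.levels = TRUE)

# Only "apple" is left, so contrasts cannot be built for f
nlevels(mf$f)
levels(mf$f)
```

So levels() on the original column can report two levels even though the model only ever sees one.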
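If only one level of your factor is present in the rows used for the fit, regressing on that factor adds no information, because it is constant across all observations. Note that this can happen even when both levels appear in the full data: by default lm() drops every row that has a missing value in any variable in the formula, so the rows that remain for f3 may all be "No".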
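One way to check for this in practice is to count factor levels among the rows lm() would actually keep. The sketch below assumes the default na.action (na.omit) and uses hypothetical toy data, not the asker's customer_accounts; the names d, used, level_counts and n_levels are made up for illustration.

```r
# Hypothetical toy data: both levels of f occur, but the only "banana" row
# has a missing x, so the default na.action would drop it before fitting.
d <- data.frame(
  y = c(1, 2, 3, 4, 5, 1),
  x = c(2, 56, 3, 5, 2, NA),
  f = factor(c("apple", "apple", "apple", "apple", "apple", "banana"))
)

# Rows with no missing value in any variable used by the formula y ~ x + f
used <- d[complete.cases(d[, c("y", "x", "f")]), ]

# Observations per level among the rows that remain
level_counts <- table(droplevels(used$f))
print(level_counts)

# Number of levels left in each factor column; a value of 1 would trigger
# the "contrasts can be applied only to factors..." error in lm()
n_levels <- sapply(Filter(is.factor, used),
                   function(col) nlevels(droplevels(col)))
print(n_levels)
```

Applied to real data, the same check would list every variable from the formula in the complete.cases() call; any factor reported with a single level would need to be removed from the formula, or the missing values in the other predictors handled first.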
Source: https://stackoverflow.com/questions/34819810/error-when-building-regression-model-using-lm-error-in-contrasts-tmp