Extract Formula from lm including Categorical Variables (R)

Submitted by 倖福魔咒の on 2020-01-25 09:51:09

Question


I have an lm object and want to get the formula extracted with coefficients. This object includes categorical variables like month, as well as interactions with these categorical variables and numeric ones.

Another user helped with some code that works for everything except categorical variables. When I add a categorical variable (e.g. d here), it breaks down and gives the error "Error in parse(text = x) : <text>:1:785: unexpected numeric constant":

a = c(1, 2, 5, 13, 40, 29, 82, 22, 34, 54, 12, 31, 21, 29, 31, 42)
b = c(12, 15, 20, 12, 34, 56, 12, 12, 15, 20, 12, 34, 56, 12, 32, 41)
c = c(20, 30, 40, 18, 72, 34, 12, 40, 18, 72, 28, 65, 21, 32, 42, 52)
d = structure(c(8L, 1L, 9L, 7L, 6L, 2L, 12L, 11L, 10L, 3L, 5L, 4L, 
8L, 1L, 9L, 7L), .Label = c("April", "August", "December", 
"February", "January", "July", "June", "March", "May", "November", 
"October", "September"), class = "factor")


model = lm(a~b+c+factor(d))

as.formula(
  paste0("y ~ ", round(coefficients(model)[1],2), " + ", 
    paste(sprintf("%.2f * %s", 
                  coefficients(model)[-1],  
                  names(coefficients(model)[-1])), 
          collapse=" + ")
  )
)

What I get from the above is:

Error in parse(text = x) : <text>:1:53: unexpected symbol
1: y ~ -7 + 14.23 * b + -6.82 * c + -529.30 * factor(d)August

What I'd like is to get the full formula, with each of the months multiplied by its coefficient (in this example only 3 of them; in my actual dataset I have much more data and every month occurs at least 8 times). But it stalls here: in this example with 'unexpected symbol', and in my actual data with "Error in parse(text = x) : <text>:1:785: unexpected numeric constant", without even reaching a month term as it does here (I'm not sure why the example and my actual code behave differently).

My formulas are quite large, so it needs to be able to scale up (which the current code does).


Answer 1:


What you are creating is not a valid formula in R, so don't try to coerce the result of sprintf into a formula.

Instead, keep it as a character string, with something like

sprintf(' y ~ %.2f + %s', coef(model)[1], 
   paste(sprintf('(%.2f) * %s',
          coef(model)[-1], names(coef(model)[-1]) ), collapse ='+'))
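If an object that R can parse is still wanted, one possible workaround is to wrap each coefficient name in backticks, so that a name such as factor(d)August is read as a single symbol. The following is only a sketch under some assumptions: it reuses the data from the question (with d rebuilt from the built-in month.name constant, which should give the same factor), the names cf, quoted, eq and parsed are illustrative, and dropping NA coefficients is an added precaution rather than part of the original answer.

```r
# Data from the question; d is rebuilt from month.name (same values and levels)
a <- c(1, 2, 5, 13, 40, 29, 82, 22, 34, 54, 12, 31, 21, 29, 31, 42)
b <- c(12, 15, 20, 12, 34, 56, 12, 12, 15, 20, 12, 34, 56, 12, 32, 41)
c <- c(20, 30, 40, 18, 72, 34, 12, 40, 18, 72, 28, 65, 21, 32, 42, 52)
d <- factor(month.name[c(3:12, 1:2, 3:6)])

model <- lm(a ~ b + c + factor(d))

cf <- coef(model)
cf <- cf[!is.na(cf)]  # terms that could not be estimated come back as NA; drop them

# Backticks turn names like factor(d)August into syntactically valid symbols
quoted <- sprintf("`%s`", names(cf)[-1])

# Build the equation as a plain character string, as in the answer above
eq <- sprintf("y ~ %.2f + %s", cf[1],
              paste(sprintf("(%.2f) * %s", cf[-1], quoted), collapse = " + "))
cat(eq, "\n")

# With the names quoted, the string can be parsed without a syntax error
parsed <- parse(text = eq)
```

The result is still only a display of the fitted equation. Even if as.formula(eq) now succeeds, the numeric constants have no meaning as model terms, so for most purposes the character string is the safer thing to keep.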



Answer 2:


In your model you have 5 explanatory variables and only 3 data points. See summary(model).
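A quick way to check for this situation, without reading the whole summary, is to compare the number of observations with the number of coefficients and look for NA estimates. The sketch below uses a hypothetical 3-row data set (the data frame toy and the object fit are made up for illustration and are not the asker's data) in which the model asks for 5 coefficients.

```r
# Hypothetical 3-row data set: more coefficients than observations
toy <- data.frame(
  a = c(1, 2, 5),
  b = c(12, 15, 20),
  c = c(20, 30, 40),
  d = factor(c("March", "April", "May"))
)

fit <- lm(a ~ b + c + d, data = toy)

n_obs  <- nobs(fit)          # observations used in the fit
n_coef <- length(coef(fit))  # coefficients the model asks for
df_res <- df.residual(fit)   # residual degrees of freedom

# Coefficients that could not be estimated are reported as NA
not_estimated <- names(coef(fit))[is.na(coef(fit))]

print(c(observations = n_obs, coefficients = n_coef, residual_df = df_res))
print(not_estimated)
```

Any NA coefficient would also end up as the text "NA" in a string built with sprintf, so it seems worth filtering such terms out before pasting the equation together.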



Source: https://stackoverflow.com/questions/21321325/extract-formula-from-lm-including-categorical-variables-r
