Many regressions using tidyverse and broom: Same dependent variable, different independent variables

Submitted by ぐ巨炮叔叔 on 2021-02-08 04:49:33

Question


This link shows how to answer my question in the case where we have the same independent variables, but potentially many different dependent variables: Use broom and tidyverse to run regressions on different dependent variables.

But my question is: how can I apply the same approach (i.e., tidyverse and broom) to run many regressions in the reverse situation, with the same dependent variable but different independent variables? In line with the code in the previous link, something like:

mod = lm(health ~ cbind(sex,income,happiness) + faculty, ds) %>% tidy()

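However, this does not do what I want: cbind() on the right-hand side simply puts all of its variables into a single model. For example, an analogous call with income as the response produces: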

Call:
lm(formula = income ~ cbind(sex, health) + faculty, data = ds)

Coefficients:
             (Intercept)     cbind(sex, health)sex  
                 945.049                   -47.911  
cbind(sex, health)health                   faculty  
                   2.342                     1.869 

which is equivalent to:

lm(formula = income ~ sex + health + faculty, data = ds)
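
A minimal sketch of why cbind() behaves differently on each side of the formula. The data here are simulated for illustration and the object names (fit_lhs, fit_rhs, fit_plain) are assumptions, not part of the original question. On the left-hand side, lm() treats a matrix as several responses; on the right-hand side, a matrix is just a multi-column predictor, so its columns all enter one model.

```r
# Illustrative simulated data; not the asker's actual data set
set.seed(11)
ds <- data.frame(income = rnorm(100, mean = 1000, sd = 200),
                 happiness = rnorm(100, mean = 6, sd = 1),
                 health = rnorm(100, mean = 20, sd = 3),
                 sex = c(0, 1),
                 faculty = c(0, 1, 2, 3))

# Matrix on the left: a multi-response fit (class "mlm") with one column of
# coefficients per dependent variable
fit_lhs <- lm(cbind(income, happiness) ~ sex + faculty, data = ds)

# Matrix on the right: one model in which both columns are predictors
fit_rhs <- lm(health ~ cbind(sex, income) + faculty, data = ds)
fit_plain <- lm(health ~ sex + income + faculty, data = ds)

class(fit_lhs)
dim(coef(fit_lhs))

# The two right-hand-side specifications give the same estimates
all.equal(unname(coef(fit_rhs)), unname(coef(fit_plain)))
```

So the cbind() shortcut only fans out over dependent variables; varying the independent variables needs one formula per model.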

Answer 1:


Basically, you'll need some way to create all the different formulas you want. Here's one way:

qq <- expression(sex,income,happiness)
formulae <- lapply(qq, function(v) bquote(health~.(v)+faculty))
# [[1]]
# health ~ sex + faculty
# [[2]]
# health ~ income + faculty
# [[3]]
# health ~ happiness + faculty
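
As an alternative sketch (not part of the original answer), base R's reformulate() can build the same set of formulas from character names, which may be convenient when the predictor names come from names(ds) or another character vector. The variable name predictors is an assumption introduced here for illustration.

```r
# Predictors that vary across models (names taken from the question)
predictors <- c("sex", "income", "happiness")

# reformulate() builds a formula from term labels and a response name;
# "faculty" is kept in every model as the common covariate
formulas <- lapply(predictors, function(v) {
  reformulate(c(v, "faculty"), response = "health")
})

formulas[[1]]
```

The resulting list can be passed to lm() in the same way as the formulae list above.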

Once you have all your formulas, you can map them to lm() and then to tidy():

library(purrr)
library(broom)

formulae %>% map(~lm(.x, ds)) %>% map_dfr(tidy, .id="model")
# A tibble: 9 x 6
#   model term         estimate std.error statistic  p.value
#   <chr> <chr>           <dbl>     <dbl>     <dbl>    <dbl>
# 1 1     (Intercept) 19.5        0.504     38.6    1.13e-60
# 2 1     sex          0.755      0.651      1.16   2.49e- 1
# 3 1     faculty     -0.00360    0.291     -0.0124 9.90e- 1
# 4 2     (Intercept) 19.8        1.70      11.7    3.18e-20
# 5 2     income      -0.000244   0.00162   -0.150  8.81e- 1
# 6 2     faculty      0.143      0.264      0.542  5.89e- 1
# 7 3     (Intercept) 18.4        1.88       9.74   4.79e-16
# 8 3     happiness    0.205      0.299      0.684  4.96e- 1
# 9 3     faculty      0.141      0.262      0.539  5.91e- 1

The output above was produced with this sample data:

set.seed(11)
ds <- data.frame(income = rnorm(100, mean=1000,sd=200),
             happiness = rnorm(100, mean = 6, sd=1),
             health = rnorm(100, mean=20, sd = 3),
             sex = c(0,1),
             faculty = c(0,1,2,3))
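
In the output above, the model column is only a position index (1, 2, 3). A possible variation, sketched here under the assumption that purrr, broom, and dplyr are installed, is to name the list of formulas so that .id carries the name of the varying predictor instead. The names predictors and results are illustrative.

```r
library(purrr)
library(broom)

# Same sample data as in this answer
set.seed(11)
ds <- data.frame(income = rnorm(100, mean = 1000, sd = 200),
                 happiness = rnorm(100, mean = 6, sd = 1),
                 health = rnorm(100, mean = 20, sd = 3),
                 sex = c(0, 1),
                 faculty = c(0, 1, 2, 3))

predictors <- c("sex", "income", "happiness")

# set_names() names each element after itself, so the list of formulas
# (and later the list of models) is named by the varying predictor
formulas <- map(set_names(predictors),
                ~ reformulate(c(.x, "faculty"), response = "health"))

# Fit one model per formula and stack the tidy() output;
# .id = "model" now holds "sex", "income", "happiness"
results <- formulas %>%
  map(~ lm(.x, data = ds)) %>%
  map_dfr(tidy, .id = "model")

results
```

With named models, keeping only the coefficient of interest from each regression becomes a simple comparison, for example results[results$term == results$model, ].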



Answer 2:


You could use the combn function to get all combinations of n independent variables and then iterate over them. Let's say n = 3 here, with income as the dependent variable:

library(tidyverse)

ds <- data.frame(income = rnorm(100, mean=1000,sd=200),
                 happiness = rnorm(100, mean = 6, sd=1),
                 health = rnorm(100, mean=20, sd = 3),
                 sex = c(0,1),
                 faculty = c(0,1,2,3))

ivs = combn(names(ds)[names(ds)!="income"], 3, simplify=FALSE)
# Or, to get all models with 1 to 4 variables:
# ivs = map(1:4, ~combn(names(ds)[names(ds)!="income"], .x, simplify=FALSE)) %>% 
#         flatten()

names(ivs) = map(ivs, ~paste(.x, collapse="-"))

models = map(ivs, 
             ~lm(as.formula(paste("income ~", paste(.x, collapse="+"))), data=ds))

map_df(models, broom::tidy, .id="model")
   model                    term        estimate std.error statistic  p.value
 * <chr>                    <chr>          <dbl>     <dbl>     <dbl>    <dbl>
 1 happiness-health-sex     (Intercept)  1086.      201.      5.39   5.00e- 7
 2 happiness-health-sex     happiness     -25.4      21.4    -1.19   2.38e- 1
 3 happiness-health-sex     health          3.58      6.99    0.512  6.10e- 1
 4 happiness-health-sex     sex            11.5      41.5     0.277  7.82e- 1
 5 happiness-health-faculty (Intercept)  1085.      197.      5.50   3.12e- 7
 6 happiness-health-faculty happiness     -25.8      20.9    -1.23   2.21e- 1
 7 happiness-health-faculty health          3.45      6.98    0.494  6.23e- 1
 8 happiness-health-faculty faculty         7.86     18.2     0.432  6.67e- 1
 9 happiness-sex-faculty    (Intercept)  1153.      141.      8.21   1.04e-12
10 happiness-sex-faculty    happiness     -25.9      21.4    -1.21   2.28e- 1
11 happiness-sex-faculty    sex             3.44     46.2     0.0744 9.41e- 1
12 happiness-sex-faculty    faculty         7.40     20.2     0.366  7.15e- 1
13 health-sex-faculty       (Intercept)   911.      143.      6.35   7.06e- 9
14 health-sex-faculty       health          3.90      7.03    0.554  5.81e- 1
15 health-sex-faculty       sex            15.6      45.6     0.343  7.32e- 1
16 health-sex-faculty       faculty         7.02     20.4     0.345  7.31e- 1
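
To make the size of the model set concrete, here is a small base R sketch of the combn() step on its own. The vector candidates is written out by hand here as an assumption; in the answer it is derived from names(ds).

```r
# Candidate predictors: every column except the dependent variable (income)
candidates <- c("happiness", "health", "sex", "faculty")

# All 3-variable subsets, one character vector per model
ivs3 <- combn(candidates, 3, simplify = FALSE)

# All subsets of size 1 to 4, flattened into a single list
ivs_all <- unlist(lapply(1:4, function(k) combn(candidates, k, simplify = FALSE)),
                  recursive = FALSE)

# One subset turned into a formula, as in the map() call above
f <- as.formula(paste("income ~", paste(ivs3[[1]], collapse = "+")))

length(ivs3)     # choose(4, 3) = 4 models, matching the four models in the output
length(ivs_all)  # 4 + 6 + 4 + 1 = 15 models
f
```

The number of models grows quickly with the number of candidate predictors, so the "all sizes" variant is only practical for a small candidate set.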


Source: https://stackoverflow.com/questions/61512343/many-regressions-using-tidyverse-and-broom-same-dependent-variable-different-i
