Question
Let's say I have the following data:
ind1 <- rnorm(99)
ind2 <- rnorm(99)
ind3 <- rnorm(99)
ind4 <- rnorm(99)
ind5 <- rnorm(99)
dep <- rnorm(99, mean=ind1)
group <- rep(c("A", "B", "C"), each=33)
df <- data.frame(dep,group, ind1, ind2, ind3, ind4, ind5)
The following code fits a multiple linear regression of the dependent variable on two independent variables, by group, which is exactly what I want to do. However, I want to regress dep against every pairwise combination of the independent variables at once. How can I fit the other models within this code?
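library(dplyr)   # %>%, mutate, select
library(tidyr)   # nest, unnest
library(purrr)   # map
library(broom)   # glance, tidy

df %>%
  nest(-group) %>%   # one row per group; newer tidyr releases prefer nest(data = -group)
  mutate(fit = map(data, ~ lm(dep ~ ind1 + ind2, data = .)),
         results1 = map(fit, glance),   # model-level statistics
         results2 = map(fit, tidy)) %>% # coefficient-level statistics
  unnest(results1) %>%
  unnest(results2) %>%
  select(group, term, estimate, r.squared, p.value, AIC) %>%
  mutate(estimate = exp(estimate))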
Thanks in advance!
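Answer 1:
Not a full tidy answer. Consider building all possible combinations of linear formulas with rapply, after an initial build with lapply and combn, then pass each formula into your tidy method. Note that the code below builds every subset of sizes 1 through 5, not only pairs: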
# For each subset size x, list every combination of x predictor names
indvar_list <- lapply(1:5, function(x)
  combn(paste0("ind", 1:5), x, simplify = FALSE))

# Turn each combination into a formula such as dep ~ ind1 + ind2
formulas_list <- rapply(indvar_list, function(x)
  as.formula(paste("dep ~", paste(x, collapse = "+"))))

# Fit one formula within each group and tidy the output
run_model <- function(f) {
  df %>%
    nest(-group) %>%
    mutate(fit = map(data, ~ lm(f, data = .)),
           results1 = map(fit, glance),
           results2 = map(fit, tidy)) %>%
    unnest(results1) %>%
    unnest(results2) %>%
    select(group, term, estimate, r.squared, p.value, AIC) %>%
    mutate(estimate = exp(estimate))
}

# One result tibble per formula
tibble_list <- lapply(formulas_list, run_model)
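Since the question asks specifically for pairs of predictors, the same idea can be narrowed by calling combn with m = 2. What follows is a minimal base-R sketch under stated assumptions, not the answerer's code: the helper fit_one and the objects pair_formulas and results are hypothetical names introduced here, set.seed is added only so the simulated data are reproducible, and the coefficients are reported untransformed (no exp()).

```r
# Simulated data shaped like the question's (seed is an added assumption)
set.seed(1)
ind <- as.data.frame(matrix(rnorm(99 * 5), ncol = 5,
                            dimnames = list(NULL, paste0("ind", 1:5))))
df <- data.frame(dep = rnorm(99, mean = ind$ind1),
                 group = rep(c("A", "B", "C"), each = 33),
                 ind)

# All unordered pairs of the five predictor names
pairs <- combn(paste0("ind", 1:5), 2, simplify = FALSE)

# One formula per pair, e.g. dep ~ ind1 + ind2
pair_formulas <- lapply(pairs, reformulate, response = "dep")
names(pair_formulas) <- vapply(pairs, paste, character(1), collapse = "+")

# Hypothetical helper: fit one formula on one group's rows and return
# coefficient estimates together with model-level statistics
fit_one <- function(f, d) {
  fit <- lm(f, data = d)
  co  <- coef(summary(fit))
  data.frame(term      = rownames(co),
             estimate  = unname(co[, "Estimate"]),
             p.value   = unname(co[, "Pr(>|t|)"]),
             r.squared = summary(fit)$r.squared,
             AIC       = AIC(fit),
             row.names = NULL)
}

# Fit every pair formula within every group and stack the results
results <- do.call(rbind, lapply(names(pair_formulas), function(nm) {
  do.call(rbind, lapply(split(df, df$group), function(d) {
    cbind(model = nm, group = d$group[1], fit_one(pair_formulas[[nm]], d))
  }))
}))
rownames(results) <- NULL
```

Each row of results is one coefficient of one model in one group, with the model column identifying the predictor pair. The same pair_formulas list could instead be passed to run_model if the tidyverse output is preferred.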
Source: https://stackoverflow.com/questions/56093202/how-can-i-modify-these-dplyr-code-for-multiple-linear-regression-by-combination