All possible Regression in R: Saving coefficients in a matrix


I am running code for all possible models of a phylogenetic generalised linear model. The issue I am having is extracting and saving the beta coefficients for each model.


2 answers

爱一瞬间的悲伤

2020-12-20 09:04

for (i in seq_along(formula)) {
    # Fit the i-th model and store its coefficients in row i
    fit <- lm(formula(formula[i]), data)
    beta[i, 1:length(fit$coefficients)] <- fit$coefficients
}

Update

Idea: name the columns of your matrix after the coefficients, and assign values to columns by name.

This is just a dummy example, but it should help. Create your output matrix:

beta <- matrix(NA,  nrow=7, ncol=4)
colnames(beta) <- c("(Intercept)", 'A', 'B', 'C')

Create some dummy data:

A <- rnorm(10)
B <- rpois(10, 1)
C <- rnorm(10, 2)
Y <- rnorm(10, -1)

Now you can do something like this:

fit <- lm(Y ~ A + B + C)
beta[1, names(fit$coefficients)] <- fit$coefficients

fit <- lm(Y ~ A + B)
beta[2, names(fit$coefficients)] <- fit$coefficients

fit <- lm(Y ~ A + C)
beta[3, names(fit$coefficients)] <- fit$coefficients

fit <- lm(Y ~ B + C)
beta[4, names(fit$coefficients)] <- fit$coefficients

fit <- lm(Y ~ A)
beta[5, names(fit$coefficients)] <- fit$coefficients

fit <- lm(Y ~ B)
beta[6, names(fit$coefficients)] <- fit$coefficients

fit <- lm(Y ~ C)
beta[7, names(fit$coefficients)] <- fit$coefficients
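
The seven fits above can also be generated in a loop. The following is only a sketch of the same name-based assignment, assuming the predictors live in a data frame; `dat`, `predictors`, and `subsets` are illustrative names that are not in the original, and the seed is there only to make the run repeatable.

```r
set.seed(1)  # assumption: fixed seed, only for reproducibility
dat <- data.frame(A = rnorm(10), B = rpois(10, 1), C = rnorm(10, 2), Y = rnorm(10, -1))
predictors <- c("A", "B", "C")

# Every non-empty subset of the predictors, as a flat list of character vectors
subsets <- unlist(
  lapply(seq_along(predictors), function(k) combn(predictors, k, simplify = FALSE)),
  recursive = FALSE
)

# One row per model, one column per possible coefficient
beta <- matrix(NA_real_, nrow = length(subsets), ncol = length(predictors) + 1,
               dimnames = list(NULL, c("(Intercept)", predictors)))

for (i in seq_along(subsets)) {
  # reformulate() builds e.g. Y ~ A + B from c("A", "B")
  fit <- lm(reformulate(subsets[[i]], response = "Y"), data = dat)
  # Assign by name so each coefficient lands in its own column
  beta[i, names(coef(fit))] <- coef(fit)
}
```

Rows are ordered by subset size here (single predictors first), which differs from the order of the hand-written fits above; terms missing from a model stay `NA`.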


感情败类

2020-12-20 09:15
How about using names and %in% to select the right columns? Extract the coefficient values with coef, like this:
```
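beta <- matrix(NA, nrow = length(formula), ncol = 3)
colnames(beta) <- colnames(inpdv)  # columns named after the predictors, as in the output below

for (i in seq_along(formula)) {
    fit <- lm(formula(formula[i]), data)
    coefs <- coef(fit)
    # Keep only coefficients that have a matching column in beta (this drops the intercept)
    beta[i, colnames(beta) %in% names(coefs)] <- coefs[names(coefs) %in% colnames(beta)]
}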
#              A          B         C
#[1,] -0.4229837 -0.0519900 0.3787666
#[2,]         NA  0.7015679 0.0555350
#[3,] -0.4165834         NA 0.3692974
#[4,]         NA         NA 0.1346726
#[5,] -0.2035173  0.7049951        NA
#[6,]         NA  0.7978726        NA
#[7,] -0.2229959         NA        NA
#[8,]         NA         NA        NA
```
I think it's perfectly acceptable to use a for loop here. For starters, in my experience something like lapply can keep increasing memory usage as you run more and more models. R will sometimes not mark objects from earlier models as garbage until the lapply call finishes, so you can get a memory allocation error. With a for loop, I find that R will reuse the memory allocated to the previous iteration if necessary, so if you can run the loop once, you can run it many times.

The usual reason to avoid a for loop is speed, but I would assume that the time spent iterating is negligible compared with the time spent fitting the model, so I would use it.
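
To make the comparison concrete, here is a small sketch that fills the same matrix once with a for loop and once with lapply. It only shows that the two styles give the same result; it does not measure memory or speed. The data, `formulas`, `cols`, and the helper `coef_row` are hypothetical names introduced for this illustration.

```r
set.seed(42)  # assumption: fixed seed, only for reproducibility
dat <- data.frame(A = rnorm(20), B = rnorm(20), C = rnorm(20), Y = rnorm(20))
formulas <- c("Y ~ A + B + C", "Y ~ A + B", "Y ~ C")
cols <- c("(Intercept)", "A", "B", "C")

# Hypothetical helper: one named row of coefficients, NA where a term is absent
coef_row <- function(f) {
  out <- setNames(rep(NA_real_, length(cols)), cols)
  cf <- coef(lm(as.formula(f), data = dat))
  out[names(cf)] <- cf
  out
}

# for-loop version with a preallocated matrix
beta_loop <- matrix(NA_real_, nrow = length(formulas), ncol = length(cols),
                    dimnames = list(NULL, cols))
for (i in seq_along(formulas)) {
  beta_loop[i, ] <- coef_row(formulas[i])
}

# lapply version: build one row per model, then bind the rows together
beta_apply <- do.call(rbind, lapply(formulas, coef_row))
```

Either way, only the coefficient vector of each model is kept, not the fitted model object itself.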