I am running code for all possible models of a phylogenetic generalised linear model. The issue I am having is extracting and saving the beta coefficients for each model.
for(i in 1:length(formula)){
fit = lm(formula(formula[i]), data)  # fit the i-th formula, not the whole vector
beta[i, 1:length(fit$coefficients)] <- fit$coefficients
}
Update
Idea: name your columns after the coefficients, and assign values to the columns by name.
This is just a dummy example, but it should help. Create your output matrix:
beta <- matrix(NA, nrow=7, ncol=4)
colnames(beta) <- c("(Intercept)", 'A', 'B', 'C')
Create some dummy data:
A <- rnorm(10)
B <- rpois(10, 1)
C <- rnorm(10, 2)
Y <- rnorm(10, -1)
Now you can do something like this:
fit <- lm(Y ~ A + B + C)
beta[1, names(fit$coefficients)] <- fit$coefficients
fit <- lm(Y ~ A + B)
beta[2, names(fit$coefficients)] <- fit$coefficients
fit <- lm(Y ~ A + C)
beta[3, names(fit$coefficients)] <- fit$coefficients
fit <- lm(Y ~ B + C)
beta[4, names(fit$coefficients)] <- fit$coefficients
fit <- lm(Y ~ A)
beta[5, names(fit$coefficients)] <- fit$coefficients
fit <- lm(Y ~ B)
beta[6, names(fit$coefficients)] <- fit$coefficients
fit <- lm(Y ~ C)
beta[7, names(fit$coefficients)] <- fit$coefficients
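The seven fits above all follow the same pattern, so they can be folded into a loop over a list of formulas. This is only a minimal sketch on dummy data; the names `dat` and `forms` are made up for illustration, and the seed is arbitrary.

```r
set.seed(1)  # arbitrary seed, only to make the dummy data reproducible
dat <- data.frame(A = rnorm(10), B = rpois(10, 1),
                  C = rnorm(10, 2), Y = rnorm(10, -1))

# Hypothetical list of candidate models, in the same order as above
forms <- list(Y ~ A + B + C, Y ~ A + B, Y ~ A + C, Y ~ B + C,
              Y ~ A, Y ~ B, Y ~ C)

# One row per model, one named column per possible coefficient
beta <- matrix(NA_real_, nrow = length(forms), ncol = 4,
               dimnames = list(NULL, c("(Intercept)", "A", "B", "C")))

for (i in seq_along(forms)) {
  cf <- coef(lm(forms[[i]], data = dat))
  beta[i, names(cf)] <- cf  # assign by name; terms not in the model stay NA
}
beta
```

Because the assignment is by name, the order of the terms within each formula does not matter.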
How about using names() and %in% to subset the right columns? Extract the coefficient values using coef(). Like this:
beta = matrix(NA, nrow = length(formula), ncol = 3)
colnames(beta) <- colnames(inpdv)
for(i in 1:length(formula)){
fit = lm(formula(formula[i]), data)
coefs <- coef(fit)
beta[ i , colnames(beta) %in% names( coefs ) ] <- coefs[ names( coefs ) %in% colnames( beta ) ]
}
# A B C
#[1,] -0.4229837 -0.0519900 0.3787666
#[2,] NA 0.7015679 0.0555350
#[3,] -0.4165834 NA 0.3692974
#[4,] NA NA 0.1346726
#[5,] -0.2035173 0.7049951 NA
#[6,] NA 0.7978726 NA
#[7,] -0.2229959 NA NA
#[8,] NA NA NA
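Since the question is about all possible models, the vector of formulas itself can also be generated rather than typed out. The following is a sketch under assumptions: the predictor names `A`, `B`, `C`, the response `Y` and the data frame `dat` are placeholders, and the intercept-only model is left out. It uses combn() and reformulate() from base R.

```r
set.seed(1)  # arbitrary seed, only to make the dummy data reproducible
dat <- data.frame(A = rnorm(10), B = rpois(10, 1),
                  C = rnorm(10, 2), Y = rnorm(10, -1))

preds <- c("A", "B", "C")  # hypothetical predictor names

# All non-empty subsets of the predictors, largest first
subsets <- unlist(lapply(rev(seq_along(preds)),
                         function(k) combn(preds, k, simplify = FALSE)),
                  recursive = FALSE)

# One formula per subset, e.g. Y ~ A + B + C
forms <- lapply(subsets, reformulate, response = "Y")

beta <- matrix(NA_real_, nrow = length(forms), ncol = length(preds) + 1,
               dimnames = list(NULL, c("(Intercept)", preds)))

for (i in seq_along(forms)) {
  cf <- coef(lm(forms[[i]], data = dat))
  beta[i, names(cf)] <- cf  # fill only the columns this model estimated
}
```

With three predictors this gives 7 models; with p predictors it gives 2^p - 1, so the number of fits grows quickly.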
I think it's perfectly acceptable to use a for loop here. For starters, using something like lapply can keep increasing memory usage as you run more and more of the simulations. R will sometimes not mark objects from earlier models as garbage until the lapply call finishes, so you can sometimes get a memory allocation error. With a for loop, I find that R will reuse the memory allocated to the previous iteration if necessary, so if you can run the loop once, you can run it lots of times.
The other reason not to use a for loop is speed, but I would assume that the time to iterate is negligible compared to the time to fit the model, so I would use it.