lm

Faster alternative to R car::Anova for sum of square crossproduct matrix calculation for subsets of predictors

穿精又带淫゛_ submitted on 2021-01-07 01:43:30
Question: I need to compute the sum of squares and cross-products (SSCP) matrix (in fact, only the trace of this matrix) in a multivariate linear model, with Y (n x q) and X (n x p). The standard R code for doing that is:

    require(MASS)
    require(car)
    # Example data
    q <- 10
    n <- 1000
    p <- 10
    Y <- mvrnorm(n, mu = rep(0, q), Sigma = diag(q))
    X <- as.data.frame(mvrnorm(n, mu = rnorm(p), Sigma = diag(p)))
    # Fit lm
    fit <- lm( Y ~ ., data = X )
    # Type I sums of squares
    summary(manova(fit))$SS
    # Type III sums of squares
    type = 3 #
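Since only the trace of the SSCP matrix is needed, one possible shortcut is to skip the matrix entirely: the trace of a residual SSCP matrix is the sum of squared residuals over all responses, so the trace for a subset of predictors is the difference in that sum between the model without the subset and the full model. The sketch below is not taken from the question; the helper names `rss_trace` and `sscp_trace` are made up for illustration, and it assumes a main-effects-only design matrix (no interactions), where dropping columns corresponds to the usual "term given all others" hypothesis.

```r
set.seed(1)
n <- 200; q <- 3; p <- 4
Y <- matrix(rnorm(n * q), n, q)
X <- cbind(1, matrix(rnorm(n * p), n, p))  # design matrix, intercept in column 1

# Trace of the residual SSCP matrix: sum of squared residuals over all responses.
# (hypothetical helper, not part of any package)
rss_trace <- function(X, Y) sum(lm.fit(X, Y)$residuals^2)

# Trace of the hypothesis SSCP matrix for the columns in `drop`:
# residual trace of the reduced model minus residual trace of the full model.
# (hypothetical helper, not part of any package)
sscp_trace <- function(X, Y, drop) {
  rss_trace(X[, -drop, drop = FALSE], Y) - rss_trace(X, Y)
}

tr_first  <- sscp_trace(X, Y, drop = 2)       # first predictor
tr_subset <- sscp_trace(X, Y, drop = c(2, 3)) # first two predictors jointly
```

`lm.fit` works on the raw design matrix and avoids the formula and model-frame overhead of `lm`, which is where much of the time tends to go when the computation is repeated for many subsets.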

Correcting dfs when using sample weights with lm

不想你离开。 submitted on 2021-01-05 07:21:46
Question: I was trying to figure out how weighting in lm actually works, and I saw this 7.5-year-old question, which gives some insight into how weights work. The data from that question is partly copied and expanded on below. I also posted a related question on Cross Validated.

    library(plyr)
    set.seed(100)
    df <- data.frame(uid=1:200,
                     bp=sample(x=c(100:200),size=200,replace=TRUE),
                     age=sample(x=c(30:65),size=200,replace=TRUE),
                     weight=sample(c(1:10),size=200,replace=TRUE),
                     stringsAsFactors=FALSE)
    set.seed(100
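To illustrate the degrees-of-freedom issue, here is a minimal sketch that assumes the weights are meant as frequency counts (an assumption, since the excerpt is cut off). `lm` treats `weights` as precision weights, so the residual df is still based on the number of rows, not on the sum of the weights. Expanding the data by the weights gives the same coefficients but a different residual df, and the standard errors differ only by the square root of the df ratio. The simulated data mirrors the question; the object names are illustrative.

```r
set.seed(100)
df <- data.frame(bp     = sample(100:200, 200, replace = TRUE),
                 age    = sample(30:65,  200, replace = TRUE),
                 weight = sample(1:10,   200, replace = TRUE))

# Weighted fit: residual df is based on the 200 rows
fit_w <- lm(bp ~ age, data = df, weights = weight)

# Expanded fit: every row repeated `weight` times
expanded <- df[rep(seq_len(nrow(df)), df$weight), ]
fit_e <- lm(bp ~ age, data = expanded)

df.residual(fit_w)  # number of rows minus 2
df.residual(fit_e)  # sum of weights minus 2

# Rescale the weighted standard errors to the expanded-data df
se_w <- coef(summary(fit_w))[, "Std. Error"]
se_e <- coef(summary(fit_e))[, "Std. Error"]
se_rescaled <- se_w * sqrt(df.residual(fit_w) / df.residual(fit_e))
```

Whether this rescaling is appropriate depends on what the weights represent; for sampling (survey) weights a design-based approach is usually a better fit than either version.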

Performing a linear model in R of a single response with a single predictor from a large dataframe and repeat for each column

99封情书 submitted on 2020-12-15 01:47:19
Question: It might not be very clear from the title, but what I wish to do is this: I have a dataframe df with, say, 200 columns, where the first 80 columns are response variables (y1, y2, y3, ...) and the remaining 120 are predictors (x1, x2, x3, ...). I wish to compute a linear model for each pair: lm(yi ~ xi, data = df). Many problems and solutions I have looked through online have either a fixed response vs many predictors or the other way around, using lapply() and its related functions. Could anyone who
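One way to fit a model per (response, predictor) pair is to iterate over two vectors of column names in parallel with `Map` and build each formula with `reformulate`. This is only a sketch on a small made-up data frame, and it assumes the pairing is by position (y1 with x1, y2 with x2, and so on), which the excerpt does not fully spell out.

```r
set.seed(1)
df <- as.data.frame(matrix(rnorm(50 * 6), ncol = 6))
names(df) <- c("y1", "y2", "y3", "x1", "x2", "x3")

responses  <- c("y1", "y2", "y3")
predictors <- c("x1", "x2", "x3")  # assumed to pair with responses by position

# Fit lm(yi ~ xi) for each pair; the result is a list named by response
fits <- Map(function(y, x) lm(reformulate(x, response = y), data = df),
            responses, predictors)

# Collect the slope of each fit into a named numeric vector
slopes <- vapply(fits, function(f) unname(coef(f)[2]), numeric(1))
```

If every response should instead be crossed with every predictor, `expand.grid(responses, predictors)` can supply the pairs to the same `Map` call.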

How to pass string formula to R's lm and see the formula in the summary?

泪湿孤枕 submitted on 2020-12-13 03:50:47
Question: In the R session below, summary(model) shows the formula as model_str. How do I get it to show as mpg ~ cyl + hp while still being able to set the model formula via a string?

    > data(mtcars)
    > names(mtcars)
     [1] "mpg"  "cyl"  "disp" "hp"   "drat" "wt"   "qsec" "vs"   "am"   "gear" "carb"
    > model_str <- 'mpg ~ cyl + hp'
    > model <- lm(model_str, data=mtcars)
    > summary(model)

    Call:
    lm(formula = model_str, data = mtcars)

    Residuals:
        Min      1Q  Median      3Q     Max
    -4.4948 -2.4901 -0.1828  1.9777  7.2934

    Coefficients:
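One commonly used approach (a sketch, not necessarily the answer given in the original thread) is to substitute the formula into the call before it is evaluated, so that `lm` records the formula itself rather than the variable name. `bquote` does this with its `.()` placeholder:

```r
data(mtcars)
model_str <- "mpg ~ cyl + hp"

# .() is replaced by the formula object before lm() is called,
# so the stored call contains the formula instead of `model_str`
model <- eval(bquote(lm(.(as.formula(model_str)), data = mtcars)))

model$call       # the call as recorded by lm()
summary(model)   # the "Call:" section now prints the formula itself
```

Calling `lm(as.formula(model_str), data = mtcars)` directly fits the same model, but the recorded call would then show `as.formula(model_str)`.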

How can I add stars to broom package's tidy() function output?

旧巷老猫 submitted on 2020-11-27 01:53:35
Question: I have been using the broom package's tidy() function in R to print my model summaries. However, tidy() returns p-values without stars, which looks odd to the many people who are used to seeing significance stars in model summaries. Does anyone know a way to add stars to the output?

Answer 1: We can use the convenient function stars.pval from gtools to do this:

    library(gtools)
    library(broom)
    library(dplyr)
    data(mtcars)
    mtcars %>% lm(mpg ~ wt + qsec, .) %>% tidy %>% mutate(signif = stars.pval
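For readers who would rather not add a dependency, the same kind of column can be built in base R with `cut`, using the cutpoints that R's own coefficient tables print in their legend. This is a sketch of an alternative, not the answer's code: the coefficient table is taken from `summary()` instead of `tidy()`, and the column names are chosen here to resemble broom's.

```r
data(mtcars)
fit <- lm(mpg ~ wt + qsec, data = mtcars)

# Coefficient table as a data frame, with broom-like column names (chosen here)
coefs <- as.data.frame(coef(summary(fit)))
names(coefs) <- c("estimate", "std.error", "statistic", "p.value")

# Map p-values to the conventional significance symbols
coefs$signif <- as.character(cut(
  coefs$p.value,
  breaks = c(0, 0.001, 0.01, 0.05, 0.1, 1),
  labels = c("***", "**", "*", ".", " "),
  include.lowest = TRUE
))

coefs
```

The same `cut` expression can be used on the `p.value` column of a `tidy()` result.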
