lm

Incorrect abline line for a regression model with intercept in R

无人久伴 submitted on 2019-12-02 03:16:44
(reproducible example given) In the following, I get an abline whose y-intercept appears to be about 30, but the regression says the y-intercept should be 37.2851. Where am I wrong?

mtcars$mpg  # 21.0 21.0 22.8 ... 21.4 (32 obs)
mtcars$wt   # 2.620 2.875 2.320 ... 2.780 (32 obs)
regression1 <- lm(mtcars$mpg ~ mtcars$wt)
coef(regression1)  # mpg ~ 37.2851 - 5.3445*wt
plot(mtcars$mpg ~ mtcars$wt, pch = 19, col = 'gray50')   # pch: shape of points
abline(h = mean(mtcars$mpg), lwd = 2, col = 'darkorange')  # y-coordinate of the horizontal line: 20.09062
abline(lm(mtcars$mpg ~ mtcars$wt), lwd = 2, col = 'sienna')

I looked at all the
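A likely resolution, sketched under the assumption that the default plot window is the culprit: plot() starts its x-axis near min(mtcars$wt), not at wt = 0, so the regression line meets the left edge of the plot well below the true intercept.

```r
# Sketch: the fitted line is correct; it only *looks* like its intercept
# is ~30 because the plot's x-axis starts near min(mtcars$wt) ~ 1.5.
fit <- lm(mpg ~ wt, data = mtcars)
p_left <- predict(fit, newdata = data.frame(wt = min(mtcars$wt)))
p_left  # about 29.2: the height of the line at the left edge of the plot
# Extending the x-axis to 0 shows the line crossing the y-axis at 37.2851:
plot(mpg ~ wt, data = mtcars, xlim = c(0, max(mtcars$wt)), pch = 19, col = 'gray50')
abline(fit, lwd = 2, col = 'sienna')
```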

Exporting R lm regression with cbind to text

江枫思渺然 submitted on 2019-12-02 01:59:39
I have the following lm with a vector of dependent variables:

fit <- lm(cbind(X1m, X3m, X6m, X1y, X2y, X3y, X5y, X6y, X10y, X20y, X30y) ~ (ff + dc), data = yields)

When attempting to export the entire output to csv, I get this error:

write.csv(as.data.frame(summary(fit)), file = "regression1.csv")
# Error in as.data.frame.default(summary(fit)) :
#   cannot coerce class '"listof"' to a data.frame

If I export just the coefficients, everything works fine:

write.csv(as.data.frame(coef(fit)), file = "regression1.csv")

I would like, however, to have t-statistics and standard errors along with my coefficients. I
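A sketch of one way around the "listof" error: summary() on a multi-response lm returns a list of per-response summaries, so the per-response coefficient tables (estimate, std. error, t value, p value) can be bound together before writing. mtcars stands in for the yields data here.

```r
# Sketch (mtcars as a stand-in): bind each response's coefficient table,
# which already contains estimates, std. errors, t values, and p values.
fit <- lm(cbind(mpg, disp) ~ wt + hp, data = mtcars)
tables <- lapply(summary(fit), function(s) as.data.frame(coef(s)))
out <- do.call(rbind, tables)  # row names become "Response <y>.<term>"
write.csv(out, file = "regression_full.csv")
```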

R: Multiple Linear Regression with a specific range of variables [duplicate]

放肆的年华 submitted on 2019-12-02 01:58:20
This question already has an answer here: short formula call for many variables when building a model [duplicate] (2 answers)

It appears simple, but I don't know how to code it in R. I have a dataframe (df) with ~100 variables, and I would like to run a multiple regression between the response, which is my first variable (Y), and variables 25 to 60 as regressors. The problem is that I don't want to write out each variable name, like:

lm(Y ~ var25 + var26 + ... + var60, data = df)

I would like to use something like [, 25:60] to select a complete range. I have tried it but it doesn't work:

test <- lm(Y~df[, 25
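One common approach, sketched with a hypothetical frame whose layout (Y first, then var1..var60) only approximates the real data: build the formula from the column names with reformulate() instead of typing them.

```r
# Sketch (hypothetical columns): construct the formula from names(df).
set.seed(1)
df <- as.data.frame(matrix(rnorm(100 * 61), ncol = 61))
names(df) <- c("Y", paste0("var", 1:60))
# Here var25..var60 sit in columns 26:61; adjust indices to your frame.
f <- reformulate(names(df)[26:61], response = "Y")
test <- lm(f, data = df)
length(coef(test))  # intercept + 36 regressors
```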

Interpreting interactions in a regression model

杀马特。学长 韩版系。学妹 submitted on 2019-12-02 01:35:31
A simple question, I hope. I have an experimental design where I measure some response (let's say blood pressure) from two groups: a control group and an affected group, where both are given three treatments: t1, t2, t3. The data are not paired in any sense. Here is an example data:

set.seed(1)
df <- data.frame(response = c(rnorm(5,10,1), rnorm(5,10,1), rnorm(5,10,1),
                              rnorm(5,7,1),  rnorm(5,5,1),  rnorm(5,10,1)),
                 group = as.factor(c(rep("control",15), rep("affected",15))),
                 treatment = as.factor(rep(c(rep("t1",5), rep("t2",5), rep("t3",5)), 2)))

What I am interested in is quantifying the effect that each
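A sketch of the standard modelling choice for this design (not from the excerpt, which is truncated before the analysis): an interaction model, where the group × treatment terms test whether the group effect differs across treatments.

```r
# Sketch: group * treatment gives each group:treatment cell its own mean;
# the interaction coefficients quantify per-treatment group effects.
set.seed(1)
df <- data.frame(
  response = c(rnorm(5,10,1), rnorm(5,10,1), rnorm(5,10,1),
               rnorm(5,7,1),  rnorm(5,5,1),  rnorm(5,10,1)),
  group = factor(c(rep("control",15), rep("affected",15))),
  treatment = factor(rep(c(rep("t1",5), rep("t2",5), rep("t3",5)), 2))
)
fit <- lm(response ~ group * treatment, data = df)
summary(fit)  # interaction rows: does the group effect change with treatment?
```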

R regression analysis: analyzing data for a certain ethnicity

亡梦爱人 submitted on 2019-12-02 01:25:58
Question: I have a data set that investigates depression among individuals of different ethnicities (Black, White, and Latina). I want to know how depression at baseline relates to depression at post across all ethnic groups, so I did:

lm(depression_base ~ depression_post, data = Data)

Now, I want to look at the relationship by ethnicity. Ethnicity in my dataset is coded as 0 = White, 1 = Black, and 2 = Latina. I am thinking that I need to use the ifelse function, but I cannot seem to get it to work. Here
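A sketch of a route that avoids ifelse() entirely (column names and data are hypothetical stand-ins): recode ethnicity as a labelled factor, then either fit one model per group or a single interaction model.

```r
# Sketch (simulated stand-in data, hypothetical column names):
set.seed(1)
Data <- data.frame(depression_base = rnorm(90),
                   ethnicity = rep(0:2, each = 30))
Data$depression_post <- Data$depression_base + rnorm(90)
Data$ethnicity <- factor(Data$ethnicity, levels = 0:2,
                         labels = c("White", "Black", "Latina"))
# One model per ethnicity:
fits <- lapply(split(Data, Data$ethnicity),
               function(d) lm(depression_base ~ depression_post, data = d))
# Or a single model testing whether the slope differs by ethnicity:
fit_int <- lm(depression_base ~ depression_post * ethnicity, data = Data)
```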

Table of multiple lm() models using apsrtable in Rmarkdown

故事扮演 submitted on 2019-12-02 01:16:27
Goal: Present the results of multiple models, created using the lm() function, together in a nicely formatted table. This table will be generated in a .Rmd file and output to a PDF document.

Proposed solution: In Reproducible Research with R and RStudio, there is an example using the apsrtable() function to display multiple models side by side. The book provides the following code (pp. 173-174):

\begin{table}
\caption{Example Nested Estimates Table with \emph{apsrtable}}
\label{BasicApsrTableExample}
\begin{center}
<<results='asis', echo=FALSE>>=
# Load apsrtable package
library(apsrtable)
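For orientation, a minimal sketch of what such a chunk's body might contain (hypothetical models; assumes the apsrtable package is installed, and that the chunk is rendered with results='asis'):

```r
# Sketch: two toy lm() fits, printed side by side as LaTeX by apsrtable()
# when the package is available.
m1 <- lm(mpg ~ wt, data = mtcars)
m2 <- lm(mpg ~ wt + hp, data = mtcars)
if (requireNamespace("apsrtable", quietly = TRUE)) {
  apsrtable::apsrtable(m1, m2)  # emits a LaTeX estimates table
}
```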

R: build separate models for each category

北战南征 submitted on 2019-12-02 00:53:22
Short version: how to build separate models for each category (without splitting the data). (I am new to R.)

Long version: consider the following synthetic data:

housetype,ht1,ht2,age,price
O,0,1,1,1000
O,0,1,2,2000
O,0,1,3,3000
N,1,0,1,10000
N,1,0,2,20000
N,1,0,3,30000

We can model the above using two separate models:

if (housetype == 'O') price = 1000 * age else price = 10000 * age

i.e. a separate model based on category type? This is what I have tried:

model <- lm(price ~ housetype + age, data = datavar)

and

model <- lm(price ~ ht1 + ht2 + age, data = datavar)

Both of the above models (which are essentially the same)
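A sketch of the usual fix: an additive model like price ~ housetype + age forces one shared slope for age, while an interaction gives each category its own slope without splitting the data.

```r
# Sketch: ~ 0 + housetype:age fits one age slope per housetype in a
# single model (no intercept, no shared slope).
datavar <- data.frame(
  housetype = c("O","O","O","N","N","N"),
  age = c(1, 2, 3, 1, 2, 3),
  price = c(1000, 2000, 3000, 10000, 20000, 30000)
)
model <- lm(price ~ 0 + housetype:age, data = datavar)
coef(model)  # slope 10000 for N, slope 1000 for O
```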

Why predicted polynomial changes drastically when only the resolution of prediction grid changes?

ε祈祈猫儿з submitted on 2019-12-02 00:28:06
Question: (This question was migrated from Cross Validated because it can be answered on Stack Overflow.)

Why do I get different predictions from the exact same model when I only change the resolution of the prediction grid (by 0.001 vs. by 0.01)?

set.seed(0)
n_data=2000
x=runif(n_data)-0.5
y=0.1*sin(x*30)/x+runif(n_data)
plot(x,y)
poly_df=5
x_exp=as.data.frame(cbind(y,poly(x, poly_df)))
fit=lm(y~.,data=x_exp)
x_plt1=seq(-1,1,0.001)
x_plt_exp1=as.data.frame(poly(x_plt1,poly_df))
lines(x_plt1
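A sketch of the usual explanation and fix: poly() builds an orthogonal basis from whatever data it is handed, so recomputing poly() on each plotting grid produces a different basis per grid. Fitting through the formula interface lets predict() reuse the training basis, and the grid resolution then no longer matters.

```r
# Sketch: keep poly() inside the formula so predict() reapplies the
# *training* basis to any new grid.
set.seed(0)
n_data <- 2000
x <- runif(n_data) - 0.5
y <- 0.1 * sin(x * 30) / x + runif(n_data)
fit <- lm(y ~ poly(x, 5))
grid_fine   <- data.frame(x = seq(-0.5, 0.5, 0.001))
grid_coarse <- data.frame(x = seq(-0.5, 0.5, 0.01))
p1 <- predict(fit, newdata = grid_fine)
p2 <- predict(fit, newdata = grid_coarse)
# The two grids now agree wherever they share an x value.
```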

Coefficient table does not have NA rows in rank-deficient fit; how to insert them?

天大地大妈咪最大 submitted on 2019-12-01 23:23:53
library(lmPerm)
x <- lmp(formula = a ~ b * c + d + e, data = df, perm = "Prob")
summary(x)
# truncated output; I can see the `NA` rows here!
# Coefficients: (1 not defined because of singularities)
#      Estimate Iter Pr(Prob)
# b       5.874   51    1.000
# c     -30.060  281    0.263
# b:c        NA   NA       NA
# d1    -31.333   60    0.633
# d2     33.297  165    0.382
# d3    -19.096   51    1.000
# e       1.976   NA       NA

I want to pull out the Pr(Prob) results for everything, but

y <- summary(x)$coef[, "Pr(Prob)"]
# (Intercept)           b           c          d1          d2
#  0.09459459  1.00000000  0.26334520  0.63333333  0.38181818
#          d3           e
#  1.00000000          NA

This is not what I want. I need the b:c row, too, in the
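A sketch of one way to reinstate the dropped rows, using plain lm() as a stand-in for lmp() (so the p-value column is "Pr(>|t|)" rather than lmPerm's "Pr(Prob)"): coef() on the fit keeps aliased terms as NA, so reindexing the summary table by those names restores the missing rows.

```r
# Sketch (lm stand-in, simulated data): force a singularity and then
# reindex the coefficient table by the full coefficient names.
set.seed(1)
df <- data.frame(y = rnorm(10), b = rnorm(10), c = rnorm(10))
df$bc <- df$b  # duplicate regressor -> rank-deficient fit, bc becomes NA
fit <- lm(y ~ b + c + bc, data = df)
tab <- coef(summary(fit))            # NA rows are dropped here
full <- tab[match(names(coef(fit)), rownames(tab)), , drop = FALSE]
rownames(full) <- names(coef(fit))   # aliased terms come back as NA rows
full[, "Pr(>|t|)"]                   # now includes an NA entry for bc
```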