lm | 易学教程

Understanding lm and environment

I'm executing lm() with the arguments formula, data, na.action, and weights. My weights are stored in a numeric variable. When I specify formula as a character string (i.e. formula = "Response~0+."), I get an error that weights is not of the proper length (even though it is). When I specify formula without the quotes (i.e. formula = Response~0+.), the function works fine. I stumbled upon this sentence in the lm() documentation:

    All of weights, subset and offset are evaluated in the same way as variables in formula, that is first in data and then in the environment of formula.

This is difficult for
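The documentation sentence quoted in this excerpt suggests one workaround, sketched here with made-up data: convert the string to a formula yourself, in the scope where the weights vector lives, so that the formula's environment is the one that holds the weights. The data frame, its columns a and b, and the vector w are hypothetical; only the formula Response ~ 0 + . comes from the question.

```r
# Toy data; the columns and the weights are invented for illustration.
df <- data.frame(Response = c(1.1, 2.3, 2.9, 4.2, 5.1),
                 a = c(1, 2, 3, 4, 5),
                 b = c(0, 1, 0, 1, 0))
w <- c(1, 2, 1, 2, 1)

# Build the formula in this scope, so environment(fml_chr) is where `w` lives.
fml_chr <- as.formula("Response ~ 0 + .")
fit_chr <- lm(fml_chr, data = df, weights = w)

# The same model written with a literal formula, for comparison.
fit_lit <- lm(Response ~ 0 + ., data = df, weights = w)

# Both fits should give the same coefficients.
all.equal(coef(fit_chr), coef(fit_lit))
```

Calling as.formula() before lm() keeps the lookup of w predictable, because the weights are searched for first in data and then in the environment attached to the formula.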

How to interpret lm() coefficient estimates when using bs() function for splines

I'm using a set of points which go from (-5,5) to (0,0) and (5,5) in a "symmetric V-shape". I'm fitting a model with lm() and the bs() function to fit a "V-shape" spline:

    lm(formula = y ~ bs(x, degree = 1, knots = c(0)))

I get the "V-shape" when I predict outcomes with predict() and draw the prediction line. But when I look at the model estimates from coef(), I see estimates that I don't expect.

    Coefficients:
                                      Estimate Std. Error t value Pr(>|t|)
    (Intercept)                        4.93821    0.16117  30.639 1.40e-09 ***
    bs(x, degree = 1, knots = c(0))1  -5.12079    0.24026 -21.313 2.47e-08 ***
    bs(x, degree = 1, knots = c(0))2  -0
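One way to read such estimates, sketched on noise-free toy data rather than the asker's points: the coefficients multiply the columns of the B-spline basis, not x itself. With degree = 1 and a single knot at 0, the first basis column peaks at the knot and the second rises toward the right boundary, so the intercept is the fitted value at the left boundary and each coefficient is the change from that value at the knot and at the right boundary respectively. The vectors x and y below are assumptions made for the illustration.

```r
library(splines)  # ships with R

# Exact symmetric V from (-5, 5) through (0, 0) to (5, 5); noise-free toy data.
x <- seq(-5, 5, by = 1)
y <- abs(x)

fit <- lm(y ~ bs(x, degree = 1, knots = c(0)))

# Design matrix: an intercept column plus the two basis columns.
X <- model.matrix(fit)

# Fitted values are the design-matrix columns weighted by the coefficients.
manual <- as.vector(X %*% coef(fit))

round(coef(fit), 6)                      # intercept, change at knot, change at right end
all.equal(manual, unname(fitted(fit)))
```

On this toy data the intercept is the value at x = -5, the first coefficient is the drop from there to the knot, and the second is the (zero) difference between the two ends of the V, which is the same pattern as the estimates in the excerpt.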

Adding lagged variables to an lm model?

I'm using lm on a time series, which works quite well actually, and it's super super fast. Let's say my model is:

    > formula <- y ~ x

I train this on a training set:

    > train <- data.frame( x = seq(1,3), y = c(2,1,4) )
    > model <- lm( formula, train )

... and I can make predictions for new data:

    > test <- data.frame( x = seq(4,6) )
    > test$y <- predict( model, newdata = test )
    > test
      x        y
    1 4 4.333333
    2 5 5.333333
    3 6 6.333333

This works super nicely, and it's really speedy. I want to add lagged variables to the model. Now, I could do this by augmenting my original training set:

    > train$y_1 <- c(0

convert string back into object in r [closed]

In R, I get the content of a binary file as a string (because of design issues, I can't access the file directly). This file was originally an lm model. How do I convert that string back into the lm model? Thanks

I'm assuming you used base::dput() according to the following example (based on this answer):

    # Generate some model over some data
    data <- sample(1:100, 30)
    df <- data.frame(x = data, y = 2 * data + 20)
    model <- lm(y ~ x, df)

    # Assuming this is what you did, you have the model structure inside model.R
    dput(model, control = c("quoteExpressions", "showAttributes"), file = "model.R")

    # -
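If you control how the model is written out, a different route from the dput() approach in this excerpt is to serialize the model to text and rebuild it with unserialize(). This is only a sketch under that assumption; it does not recover a model from a file saved some other way. The added noise term and the names model_string and restored are illustrative.

```r
set.seed(1)
x <- sample(1:100, 30)
df <- data.frame(x = x, y = 2 * x + 20 + rnorm(30))
model <- lm(y ~ x, df)

# Writer side: serialize to an ASCII representation and keep it as one string.
model_string <- rawToChar(serialize(model, connection = NULL, ascii = TRUE))

# Reader side: turn the string back into raw bytes and rebuild the object.
restored <- unserialize(charToRaw(model_string))

# Compare with a tolerance rather than exact equality.
all.equal(coef(model), coef(restored))
predict(restored, newdata = data.frame(x = c(10, 20)))
```

The restored object is an ordinary lm fit, so coef(), summary() and predict() work on it as usual.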

Linear regression in R with if statement [duplicate]

This question already has an answer here: How to run linear model in R with certain data range? (1 answer)

I have a dummy variable black where black==0 is White and black==1 is Black. I am trying to fit a linear model lm for the black==1 category only, however running the code below gives me the incorrect coefficients. Is there a way in R to run a model with the if statement, similar to Stata?

    library(foreign)
    df<-read.dta("hw4.dta")
    attach(df)
    black[black==0]<-NA
    model3<-lm(rent~I(income^2)+income+black)

It looks like there are a few issues here. First, you've stored all your data in separate
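A minimal sketch of the usual R counterpart to Stata's if qualifier: pass a condition to the subset argument of lm() and leave the data frame untouched. The simulated data below stands in for hw4.dta, which is not available here; only the variable names rent, income and black come from the question.

```r
set.seed(42)
# Simulated stand-in for the asker's data set.
df <- data.frame(
  black  = rep(c(0, 1), each = 10),
  income = runif(20, 10, 100)
)
df$rent <- 200 + 5 * df$income + 0.01 * df$income^2 + rnorm(20)

# Fit only on rows where black == 1; no attach() and no overwriting with NA.
fit_subset <- lm(rent ~ I(income^2) + income, data = df, subset = black == 1)

# Equivalent: filter the data frame first.
fit_manual <- lm(rent ~ I(income^2) + income, data = df[df$black == 1, ])

all.equal(coef(fit_subset), coef(fit_manual))
nobs(fit_subset)  # number of rows actually used
```

Note that black is dropped from the formula: within the black == 1 subset it is constant, so it carries no information for the fit.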

Fitting linear model / ANOVA by group [duplicate]

This question already has an answer here: Linear Regression and group by in R (10 answers)

I'm trying to run anova() in R and running into some difficulty. This is what I've done up to now to help shed some light on my question. Here is the str() of my data to this point.

    str(mhw)
    'data.frame': 500 obs. of 5 variables:
     $ r    : int 1 2 3 4 5 6 7 8 9 10 ...
     $ c    : int 1 1 1 1 1 1 1 1 1 1 ...
     $ grain: num 3.63 4.07 4.51 3.9 3.63 3.16 3.18 3.42 3.97 3.4 ...
     $ straw: num 6.37 6.24 7.05 6.91 5.93 5.59 5.32 5.52 6.03 5.66 ...
     $ Quad : Factor w/ 4 levels "NE","NW","SE",..: 2 2 2 2 2 2 2 2 2 2 ...

Column r

Print R-squared for all of the models fit with lmList

I used lmList to fit 480 relationships and I would like the R2 of each of these. Here is an example dataset and model which are pretty close to what it really looks like, except I have 480 eu (experimental units):

    eu  mass  day
    11  .02   1
    11  .03   2
    11  .04   3
    11  .06   4
    12  .01   1
    12  .03   2
    12  .04   3
    12  .05   4

    fit<-lmList(mass ~ day | eu, data=df)

Printing fit or summary does not give me the information I want. I am ultimately trying to make a new dataframe that will look like:

    eu  intercept  slope  R2
    11  .01        .95    .98
    12  .01        .96    .98

I've got the coefficients through coef, now I need the R-squared. Here you go:
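Since the answer is cut off in this excerpt, here is one base-R sketch of the idea rather than the original solution: fit one lm() per experimental unit with split() and lapply(), then pull the R-squared out of each summary(). It uses the example rows from the question; the object names fits and result are illustrative, and lmList itself is deliberately not used so that no extra package is needed.

```r
# Example rows from the question.
df <- data.frame(
  eu   = rep(c(11, 12), each = 4),
  mass = c(.02, .03, .04, .06, .01, .03, .04, .05),
  day  = rep(1:4, times = 2)
)

# One ordinary lm fit per experimental unit.
fits <- lapply(split(df, df$eu), function(d) lm(mass ~ day, data = d))

# Collect intercept, slope and R-squared into one data frame.
result <- data.frame(
  eu        = names(fits),
  intercept = sapply(fits, function(m) coef(m)[["(Intercept)"]]),
  slope     = sapply(fits, function(m) coef(m)[["day"]]),
  R2        = sapply(fits, function(m) summary(m)$r.squared),
  row.names = NULL
)
result
```

The same sapply() over summary(m)$r.squared should carry over to any list of lm fits, whichever function produced the list.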

Why does lm run out of memory while matrix multiplication works fine for coefficients?

I am trying to do fixed effects linear regression with R. My data looks like

    dte  yr  id  v1  v2
    .    .   .   .   .
    .    .   .   .   .
    .    .   .   .   .

I then decided to simply do this by making yr a factor and using lm:

    lm(v1 ~ factor(yr) + v2 - 1, data = df)

However, this seems to run out of memory. I have 20 levels in my factor and df is 14 million rows, which takes about 2 GB to store. I am running this on a machine with 22 GB dedicated to this process. I then decided to try things the old-fashioned way: create dummy variables for each of my years t1 to t20 by doing:

    df$t1 <- 1*(df$yr==1)
    df$t2 <- 1*(df$yr==2)
    df$t3 <- 1
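To make the comparison in the title concrete, here is a small sketch on simulated data (1,000 rows, not 14 million) showing that the coefficients from lm() match those from the normal equations computed with matrix products. One plausible reason for the memory gap is that lm() also keeps the model frame, residuals, fitted values and a QR decomposition, whereas the matrix route only needs the cross-products; that explanation is an assumption, not something this excerpt confirms. The simulated column values are made up.

```r
set.seed(1)
n  <- 1000
df <- data.frame(yr = sample(1:20, n, replace = TRUE), v2 = rnorm(n))
df$v1 <- 0.5 * df$yr + 2 * df$v2 + rnorm(n)

# Reference fit with lm().
fit <- lm(v1 ~ factor(yr) + v2 - 1, data = df)

# Same coefficients from the normal equations: solve (X'X) beta = X'y.
X    <- model.matrix(~ factor(yr) + v2 - 1, data = df)
beta <- solve(crossprod(X), crossprod(X, df$v1))

all.equal(unname(coef(fit)), as.vector(beta))
```

The normal equations are generally less numerically stable than the QR approach, so this is a trade of robustness for memory, and it returns coefficients only (no standard errors or residuals).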

Create and Call Linear Models from List

So I'm trying to compare different linear models in order to determine if one is better than another. However, I have several models, so I want to create a list of models and then call on them. Is that possible?

    Models <- list(lm(y~a),lm(y~b),lm(y~c))
    Models2 <- list(lm(y~a+b),lm(y~a+c),lm(y~b+c))
    anova(Models2[1],Models[1])

Thank you for your help!

If you have two lists of models, and you want to compare each pair of models, then you want Map:

    models1 <- list(lm(y ~ a), lm(y ~ b), lm(y ~ c))
    models2 <- list(lm(y ~ a + b), lm(y ~ a + c), lm(y ~ b + c))
    Map(anova, models1, models2)

This is
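A runnable sketch of the Map approach from the answer, on simulated data, since the original y, a, b and c are not shown. It also illustrates the detail that trips up the question's own call: a single model must be extracted with [[ ]], because [ ] returns a one-element list rather than the lm object.

```r
set.seed(123)
n <- 50
a <- rnorm(n); b <- rnorm(n); c <- rnorm(n)
y <- 1 + 2 * a - b + 0.5 * c + rnorm(n)   # simulated response

models1 <- list(lm(y ~ a), lm(y ~ b), lm(y ~ c))
models2 <- list(lm(y ~ a + b), lm(y ~ a + c), lm(y ~ b + c))

# Pairwise comparison: first with first, second with second, and so on.
comparisons <- Map(anova, models1, models2)

# Comparing one pair by hand: use [[ ]] to get the model itself.
single <- anova(models1[[1]], models2[[1]])
single
```

Keep in mind that the F test from anova() is only meaningful when the smaller model is nested in the larger one; with the pairing above, the second pair (y ~ b against y ~ a + c) is not nested.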

Linear Regression and storing results in data frame [duplicate]

This question already has an answer here: Linear Regression and group by in R (10 answers)

I am running a linear regression on some variables in a data frame. I'd like to be able to subset the linear regressions by a categorical variable, run the linear regression for each categorical variable, and then store the t-stats in a data frame. I'd like to do this without a loop if possible. Here's a sample of what I'm trying to do:

    a<- c("a","a","a","a","a", "b","b","b","b","b", "c","c","c","c","c")
    b<- c(0.1,0.2,0.3,0.2,0.3, 0.1,0.2,0.3,0.2,0.3, 0.1,0.2,0.3,0.2,0.3)
    c<- c(0.2,0.1,0.3,0.2,0.4, 0.2,0.5
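One loop-free way to do this in base R, sketched on made-up data because the sample vectors in this excerpt are cut off: split the data frame by the categorical variable, fit lm() on each piece, and bind the t values from summary() into one data frame. The names dat, grp, x, y and t_stats are hypothetical.

```r
set.seed(7)
# Made-up data with three groups of five rows each.
dat <- data.frame(
  grp = rep(c("a", "b", "c"), each = 5),
  x   = rep(c(0.1, 0.2, 0.3, 0.2, 0.3), times = 3)
)
dat$y <- 0.5 + 1.5 * dat$x + rnorm(15, sd = 0.05)

# Fit one regression per group and keep only the t statistics.
t_stats <- do.call(rbind, lapply(split(dat, dat$grp), function(d) {
  tt <- summary(lm(y ~ x, data = d))$coefficients[, "t value"]
  data.frame(grp         = d$grp[1],
             t_intercept = tt[["(Intercept)"]],
             t_slope     = tt[["x"]])
}))
rownames(t_stats) <- NULL
t_stats
```

Each row of t_stats corresponds to one level of the grouping variable, so the result can be merged back onto other per-group summaries by grp.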