Linear Regression and group by in R

前端未结

关注

 10  1329

I want to do a linear regression in R using the lm() function. My data is an annual time series with one field for year (22 years) and another for state (50 sta

相关标签:

10条回答

故里飘歌

2020-11-22 02:55

Here's an approach using the plyr package:

d <- data.frame(
  state = rep(c('NY', 'CA'), 10),
  year = rep(1:10, 2),
  response= rnorm(20)
)

library(plyr)
# Break up d by state, then fit the specified model to each piece and
# return a list
models <- dlply(d, "state", function(df) 
  lm(response ~ year, data = df))

# Apply coef to each model and return a data frame
ldply(models, coef)

# Print the summary of each model
l_ply(models, summary, .print = TRUE)

0 讨论(0)

忘了有多久

2020-11-22 02:55
In my opinion is a mixed linear model a better approach for this kind of data. The code below given in the fixed effect the overall trend. The random effects indicate how the trend for each individual state differ from the global trend. The correlation structure takes the temporal autocorrelation into account. Have a look at Pinheiro & Bates (Mixed Effects Models in S and S-Plus).
```
library(nlme)
lme(response ~ year, random = ~year|state, correlation = corAR1(~year))
```
0 讨论(0)
发布评论:

提交评论
- 加载中...
情话喂你

2020-11-22 02:55
The lm() function above is an simple example. By the way, I imagine that your database has the columns as in the following form:

year state var1 var2 y...

In my point of view, you can to use the following code:
```
require(base) 
library(base) 
attach(data) # data = your data base
             #state is your label for the states column
modell<-by(data, data$state, function(data) lm(y~I(1/var1)+I(1/var2)))
summary(modell)
```
0 讨论(0)
发布评论:

提交评论
- 加载中...

眼角桃花

2020-11-22 02:56

A nice solution using data.table was posted here in CrossValidated by @Zach. I'd just add that it is possible to obtain iteratively also the regression coefficient r^2:

## make fake data
    library(data.table)
    set.seed(1)
    dat <- data.table(x=runif(100), y=runif(100), grp=rep(1:2,50))

##calculate the regression coefficient r^2
    dat[,summary(lm(y~x))$r.squared,by=grp]
       grp         V1
    1:   1 0.01465726
    2:   2 0.02256595

as well as all the other output from summary(lm):

dat[,list(r2=summary(lm(y~x))$r.squared , f=summary(lm(y~x))$fstatistic[1] ),by=grp]
   grp         r2        f
1:   1 0.01465726 0.714014
2:   2 0.02256595 1.108173

0 讨论(0)

上一页 1 2