panel-data

Taking a 3 year average across in a panel data set with NAs

孤人 submitted on 2020-01-03 05:03:12
Question: I have the following dataframe, called DF:

    Country  Year  Var1  Var2
    USA      2010     5     3
    USA      2011     6     5
    USA      2012    NA     8
    USA      2013     4    NA
    USA      2014    NA     6
    USA      2015     6     9
    CHN      2010    NA     5
    CHN      2011     7    NA
    CHN      2012     6    NA
    CHN      2013     4     4
    CHN      2014    NA     6
    CHN      2015    NA     8
    EGY      2010     3    NA
    EGY      2011     3     5
    EGY      2012     3     6
    EGY      2013    NA     8
    EGY      2014    NA    NA
    EGY      2015    NA     2

I want to take a 3-year average of the data. However, if there are only two years of available data within a particular three-year interval, I want to ignore the NA and take a two year
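
One way to read this request can be sketched in base R: group the years into non-overlapping three-year windows and average whatever values are present in each window. The window boundaries (2010-2012 and 2013-2015), the column name `Period`, the helper name `mean_available`, and the choice to return NA only when a window has no data at all are assumptions. The excerpt is cut off before it says how a window with a single observation should be treated.

```r
# Data copied from the question excerpt
DF <- data.frame(
  Country = rep(c("USA", "CHN", "EGY"), each = 6),
  Year    = rep(2010:2015, times = 3),
  Var1    = c(5, 6, NA, 4, NA, 6,  NA, 7, 6, 4, NA, NA,  3, 3, 3, NA, NA, NA),
  Var2    = c(3, 5, 8, NA, 6, 9,  5, NA, NA, 4, 6, 8,  NA, 5, 6, 8, NA, 2)
)

# Assumption: non-overlapping windows, 0 = 2010-2012 and 1 = 2013-2015
DF$Period <- (DF$Year - min(DF$Year)) %/% 3

# Hypothetical helper: mean of the non-missing values,
# NA when the whole window is missing
mean_available <- function(x) {
  if (all(is.na(x))) NA_real_ else mean(x, na.rm = TRUE)
}

# One row per Country and Period, one averaged column per variable
avg <- aggregate(DF[c("Var1", "Var2")],
                 by  = DF[c("Country", "Period")],
                 FUN = mean_available)
print(avg)
```

If windows with only one observation should be dropped instead, the helper could return NA whenever `sum(!is.na(x)) < 2`.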

Residuals from first differenced regression on unbalanced panel

假装没事ソ submitted on 2020-01-01 18:55:46
Question: I am trying to use plm to estimate a first-differenced model on some unbalanced panel data. My model seems to work and I get coefficient estimates, but I want to know if there is a way to get the residual (or fitted value) for each observation used. I have run into two problems: I don't know how to attach residuals to the observations they are associated with, and I seem to get an incorrect number of residuals. If I retrieve the residuals from the estimated model using model.name$residuals, I get a

Efficient calculation of var-covar matrix in R

时光毁灭记忆、已成空白 submitted on 2019-12-31 21:58:31
Question: I'm looking for efficiency gains in calculating the (auto)covariance matrix from individual measurements over time t with t, t-1, etc. In the data matrix, each row represents an individual and each column represents a monthly measurement (the columns are in time order). The data look similar to the following (although with some more covariance).

    # simulate data
    set.seed(1)
    periods <- 70L
    ind <- 90000L
    mat <- sapply(rep(ind, periods), rnorm)

Below is the (ugly) code I came up with to get the

Create lagged variable in unbalanced panel data in R

醉酒当歌 submitted on 2019-12-28 12:04:12
Question: I'd like to create a variable containing the value of a variable in the previous year within a group.

      id date value
    1  1 1992   4.1
    2  1   NA   4.5
    3  1 1991   3.3
    4  1 1990   5.3
    5  1 1994   3.0
    6  2 1992   3.2
    7  2 1991   5.2

value_lagged should be missing when the previous year is missing within a group, either because it is the first date within a group (as in rows 4 and 7) or because there are year gaps in the data (as in row 5). Also, value_lagged should be missing when the current time is missing (as in row 2)
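
The rules stated in this excerpt can be sketched in base R by looking up, for each row, the row with the same id and the preceding year. This is a minimal sketch and not the answer from the original thread. It assumes that each (id, date) pair occurs at most once, and the names `key_now`, `key_prev`, and `idx` are made up for illustration.

```r
# Data copied from the question excerpt
df <- data.frame(
  id    = c(1, 1, 1, 1, 1, 2, 2),
  date  = c(1992, NA, 1991, 1990, 1994, 1992, 1991),
  value = c(4.1, 4.5, 3.3, 5.3, 3.0, 3.2, 5.2)
)

# Build a key for each row and a key for the year before it
key_now  <- paste(df$id, df$date)
key_prev <- paste(df$id, df$date - 1)

# Position of the previous-year row, NA when no such row exists
# (first year of a group, or a gap in the years)
idx <- match(key_prev, key_now)

# A row with a missing date must not match itself through the "NA" key
idx[is.na(df$date)] <- NA

df$value_lagged <- df$value[idx]
print(df)
```

Because the lookup works on the year value and not on row order, the data do not need to be sorted first.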

R Ordering of LME covariates for level 1 and level 2 variables?

半世苍凉 submitted on 2019-12-25 09:00:33
Question: I have longitudinal data with level 1 and level 2 variables in R. My dataframe (df):

    ID  Year  Gender  Race  MathScore  DepressionScore  MemoryScore
    1   1999  M       C     80         15               80
    1   2000  M       C     81         25               60
    1   2001  M       C     70         50               75
    2   1999  F       C     65         15               99
    2   2000  F       C     70         31               98
    2   2001  F       C     71         30               99
    3   1999  F       AA    92         10               90
    3   2000  F       AA    89         10               91
    3   2001  F       AA    85         26               80

I've tried these:

    summary(fix <- lme(MathScore ~ Gender+Race+DepressionScore+MemoryScore, random= Year|ID, data=df, na.action="na.omit")
    summary(fix2 <- lme(MathScore ~ 1

R: No way to get double-clustered standard errors for an object of class “c('pmg', 'panelmodel')”?

一世执手 submitted on 2019-12-24 13:22:29
Question: I am estimating a Fama-MacBeth regression. I have taken the code from this site:

    fpmg <- pmg(Mumbo~Jumbo, test, index=c("year","firmid"))
    summary(fpmg)

    Mean Groups model

    Call:
    pmg(formula = Mumbo ~ Jumbo, data = superfdf, index = c("day","Firm"))

    Residuals
         Min.   1st Qu.    Median      Mean   3rd Qu.      Max.
    -0.142200 -0.006930  0.000000  0.000000  0.006093  0.142900

    Coefficients
                   Estimate  Std. Error  z-value  Pr(>|z|)
    (Intercept) -3.0114e-03  3.7080e-03  -0.8121    0.4167
    Jumbo        4.9434e-05  3.4309e-04   0.1441    0.8854

    Total

Double clustered standard errors for panel data

戏子无情 submitted on 2019-12-17 18:33:33
Question: I have a panel data set in R (time and cross section) and would like to compute standard errors that are clustered by two dimensions, because my residuals are correlated both ways. Googling around, I found http://thetarzan.wordpress.com/2011/06/11/clustered-standard-errors-in-r/ which provides a function to do this. It seems a bit ad hoc, so I wanted to know if there is a package that has been tested and does this. I know sandwich does HAC standard errors, but it doesn't do double clustering (i

How to determine (complex) panel pattern?

匆匆过客 submitted on 2019-12-13 02:22:04
Question: My question is closely related to existing discussions on Statalist like this one. I am raising a new question because I want to look at more complex panel patterns beyond the number of consecutive spells. Say, given a panel of firms, I want to check for how many years a firm owns no real estate property (land == 0) before it buys some (land > 0). Or, in a more sophisticated case, for how many years the firm's property is below some level (land < 0.05 * land[s]), where s refers to the year a firm

subsetting Panel Data conditional on consecutive strings of length

懵懂的女人 submitted on 2019-12-11 14:33:13
Question: I'm stuck trying to subset some panel data, i.e. ids within groups, using dplyr. I want to extract all ids, within each group grp, that have a NUM series with a minimum smaller than 2 and a maximum greater than 2. I've constructed a minimal working example below that should illustrate the issue. I have been working with filter() and row_number() == c(1,n()), and have tried to separate the data out and merge it back together with different types of _join, but I am stuck and I am now turning to the SO