panel-data | 易学教程

Taking a 3-year average in a panel data set with NAs

Question: I have the following data frame, called DF:

    Country  Year  Var1  Var2
    USA      2010     5     3
    USA      2011     6     5
    USA      2012    NA     8
    USA      2013     4    NA
    USA      2014    NA     6
    USA      2015     6     9
    CHN      2010    NA     5
    CHN      2011     7    NA
    CHN      2012     6    NA
    CHN      2013     4     4
    CHN      2014    NA     6
    CHN      2015    NA     8
    EGY      2010     3    NA
    EGY      2011     3     5
    EGY      2012     3     6
    EGY      2013    NA     8
    EGY      2014    NA    NA
    EGY      2015    NA     2

I want to take a 3-year average of the data. However, if there are only two years of available data within a particular three-year interval, I want to ignore the NA and take a two year

Residuals from first differenced regression on unbalanced panel

Question: I am trying to use plm to estimate a first-differenced model on some unbalanced panel data. My model seems to work and I get coefficient estimates, but I want to know if there is a way to get the residual (or fitted value) for each observation used. I have run into two problems: I don't know how to attach residuals to the observations they are associated with, and I seem to get an incorrect number of residuals. If I retrieve the residuals from the estimated model using model.name$residuals, I get a

Efficient calculation of var-covar matrix in R

Question: I'm looking for efficiency gains in calculating the (auto)covariance matrix from individual measurements over time, i.e. t with t, t-1, etc. In the data matrix, each row represents an individual and each column represents a monthly measurement (the columns are in time order). The data are similar to the following (although with some more covariance):

    # simulate data
    set.seed(1)
    periods <- 70L
    ind <- 90000L
    mat <- sapply(rep(ind, periods), rnorm)

Below is the (ugly) code I came up with to get the

Create lagged variable in unbalanced panel data in R

Question: I'd like to create a variable containing the value of a variable in the previous year within a group.

      id date value
    1  1 1992   4.1
    2  1   NA   4.5
    3  1 1991   3.3
    4  1 1990   5.3
    5  1 1994   3.0
    6  2 1992   3.2
    7  2 1991   5.2

value_lagged should be missing when the previous year is missing within a group, either because it is the first date within a group (as in rows 4 and 7) or because there are year gaps in the data (as in row 5). Also, value_lagged should be missing when the current time is missing (as in row 2)

R Ordering of LME covariates for level 1 and level 2 variables?

Question: I have longitudinal data with level 1 and level 2 variables in R. My data frame (df):

    ID  Year  Gender  Race  MathScore  DepressionScore  MemoryScore
     1  1999  M       C            80               15           80
     1  2000  M       C            81               25           60
     1  2001  M       C            70               50           75
     2  1999  F       C            65               15           99
     2  2000  F       C            70               31           98
     2  2001  F       C            71               30           99
     3  1999  F       AA           92               10           90
     3  2000  F       AA           89               10           91
     3  2001  F       AA           85               26           80

I've tried these:

    summary(fix <- lme(MathScore ~ Gender + Race + DepressionScore + MemoryScore, random = ~ Year | ID, data = df, na.action = "na.omit"))
    summary(fix2 <- lme(MathScore ~ 1

R: No way to get double-clustered standard errors for an object of class “c('pmg', 'panelmodel')”?

Question: I am estimating a Fama-MacBeth regression. I have taken the code from this site:

    fpmg <- pmg(Mumbo~Jumbo, test, index=c("year","firmid"))
    summary(fpmg)

    Mean Groups model

    Call:
    pmg(formula = Mumbo ~ Jumbo, data = superfdf, index = c("day","Firm"))

    Residuals
         Min.   1st Qu.    Median      Mean   3rd Qu.      Max.
    -0.142200 -0.006930  0.000000  0.000000  0.006093  0.142900

    Coefficients
                   Estimate  Std. Error  z-value  Pr(>|z|)
    (Intercept) -3.0114e-03  3.7080e-03  -0.8121    0.4167
    Jumbo        4.9434e-05  3.4309e-04   0.1441    0.8854
    Total

Double clustered standard errors for panel data

Question: I have a panel data set in R (time and cross section) and would like to compute standard errors that are clustered along two dimensions, because my residuals are correlated both ways. Googling around, I found http://thetarzan.wordpress.com/2011/06/11/clustered-standard-errors-in-r/ which provides a function to do this. It seems a bit ad hoc, so I wanted to know whether there is a well-tested package that does this. I know sandwich does HAC standard errors, but it doesn't do double clustering (i

How to determine (complex) panel pattern?

Question: My question is closely related to existing discussions on Statalist like this one. I am raising a new question because I want to look at more complex panel patterns than the number of consecutive spells. Say, given a panel of firms, I want to check for how many years a firm owns no real estate property (land == 0) before it buys some (land > 0). Or, in a more sophisticated case, for how many years the firm's property is below some level (land < 0.05 * land[s]), where s refers to the year a firm

subsetting Panel Data conditional on consecutive strings of length

Question: I'm stuck trying to subset some panel data, i.e. ids within groups, using dplyr. I want to extract all ids, within each group grp, that have a NUM series with a minimum smaller than 2 and a maximum greater than 2. I've constructed a minimal working example below that should illustrate the issue. I have been working with filter() and row_number() == c(1,n()), and have tried to separate the data out and merge it back together with different types of _join, but I am stuck and I am now turning to the SO