panel-data | 易学教程

Simple moving average on an unbalanced panel in R

Question: I am working with an unbalanced, irregularly spaced cross-sectional time series. My goal is to obtain a lagged moving average vector for the "Quantity" vector, segmented by "Subject". In other words, say the following Quantities have been observed for Subject_1: [1,2,3,4,5]. I first need to lag it by 1, yielding [NA,1,2,3,4]. Then I need to take a moving average of order 3, yielding [NA,NA,NA,(3+2+1)/3,(4+3+2)/3]. The above needs to be done for all Subjects.

# Construct example
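The two steps described in the question (lag by one observation, then a trailing mean of order 3, repeated per subject) can be sketched in base R. This is only an illustration, not the accepted answer: the helper name `lagged_ma`, the toy data frame and the `LaggedMA` column are made up here, and the sketch assumes rows are already sorted by time within each subject. It lags by observation, so irregular spacing in calendar time is ignored.

```r
# Hypothetical helper: lag by one observation, then take a trailing mean
# over k lagged values. Windows that are too short or contain NA give NA.
lagged_ma <- function(x, k = 3) {
  lagged <- c(NA, head(x, -1))            # [1,2,3,4,5] -> [NA,1,2,3,4]
  out <- rep(NA_real_, length(x))
  for (i in seq_along(x)) {
    if (i >= k) out[i] <- mean(lagged[(i - k + 1):i])
  }
  out
}

# Toy unbalanced panel (made-up data): Subject_2 has only two observations.
panel <- data.frame(
  Subject  = c(rep("Subject_1", 5), rep("Subject_2", 2)),
  Quantity = c(1, 2, 3, 4, 5, 10, 20)
)

# ave() applies the helper separately within each Subject.
panel$LaggedMA <- ave(panel$Quantity, panel$Subject, FUN = lagged_ma)
print(panel)
```

For Subject_1 this reproduces the vector from the question, [NA,NA,NA,2,3]. A subject with fewer than four observations gets only NA values, which is how short series in an unbalanced panel behave under this definition.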

R - Using data.table to efficiently test rolling conditions across multiple rows and columns

I am trying to test a variety of conditions in a data.table that looks like this reproducible example:

    library(data.table)
    set.seed(17)
    year <- 1980 + rnbinom(10000, 3, 0.35)
    event <- rep(LETTERS, length.out = 10000)
    z <- as.integer(runif(10000, min = 0, max = 10))
    dt <- data.table(event, year, z)
    setkey(dt, event, year)
    dt <- dt[, sum(z), by = c("event", "year")]

V1 (which emerges from the last command) represents a count of event occurrences. So the data table is an ordered array and I need to execute a variety of functions on it. Here are some examples: How do I calculate a rolling sum (or rolling mean) of the occurrences in

R plm lag - what is the equivalent to L1.x in Stata?

Question: Using the plm package in R to fit a fixed-effects model, what is the correct syntax to add a lagged variable to the model, similar to the 'L1.variable' operator in Stata? Here is my attempt at adding a lagged variable (this is a test model and it might not make sense):

    library(foreign)
    library(plm)
    nlswork <- read.dta("http://www.stata-press.com/data/r11/nlswork.dta")
    pnlswork <- plm.data(nlswork, c('idcode', 'year'))
    ffe <- plm(ln_wage ~ ttl_exp + lag(wks_work, 1), model = 'within', data = nlswork)
    summary
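One likely source of trouble in the attempt above is that the model is fitted on the plain data frame `nlswork`, so `lag()` may resolve to `stats::lag()`, which only shifts the time attribute of a vector and leaves its values where they are. The base-R sketch below shows that behaviour and a hand-rolled, time-aware equivalent of Stata's `L1.`. The helper `l1()` and the toy data are hypothetical, and the sketch assumes integer time values and one row per id and period. With plm itself, the usual route is to pass a `pdata.frame` as `data` so that the panel-aware `lag()` method is used.

```r
# stats::lag() on a plain numeric vector shifts only the "tsp" time
# attribute; the values are returned in their original positions.
x <- c(10, 20, 30, 40)
shifted <- stats::lag(x, 1)
print(as.numeric(shifted))   # same values as x

# Hypothetical helper mimicking L1.x: for each row, look up the value of x
# for the same id in the previous period. Missing periods give NA.
l1 <- function(x, id, time) {
  prev_row <- match(paste(id, time - 1), paste(id, time))
  x[prev_row]
}

# Toy panel (made-up numbers); id 1 has a gap at year 70.
df <- data.frame(
  idcode   = c(1, 1, 1, 2, 2),
  year     = c(68, 69, 71, 70, 71),
  wks_work = c(27, 24, 30, 51, 52)
)
df$wks_work_l1 <- l1(df$wks_work, df$idcode, df$year)
print(df)
```

Because the lookup is keyed on `year - 1` rather than on the previous row, the observation for id 1 in year 71 gets NA instead of the year-69 value, and values never leak from one id into the next.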

Difference in Differences in Python + Pandas

Question: I'm trying to perform a Difference in Differences (with panel data and fixed effects) analysis using Python and Pandas. I have no background in Economics and I'm just trying to filter the data and run the method that I was told to. However, as far as I could learn, the basic diff-in-diffs model looks like this: I.e., I am dealing with a multivariable model. Here is a simple example in R: https://thetarzan.wordpress.com/2011/06/20/differences-in-differences-estimation
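For orientation, the textbook two-group, two-period difference-in-differences regression is usually written as below. Whether this is exactly the formula the question referred to is an assumption, and the symbols (Treat, Post, the betas) are the conventional ones rather than names taken from the question.

```latex
y_{it} = \beta_0
       + \beta_1 \,\mathrm{Treat}_i
       + \beta_2 \,\mathrm{Post}_t
       + \beta_3 \,(\mathrm{Treat}_i \times \mathrm{Post}_t)
       + \varepsilon_{it}
```

Here Treat is 1 for units in the treated group, Post is 1 for periods after the intervention, and the coefficient on the interaction, \beta_3, is the difference-in-differences estimate.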

Hausman type test in R

I have been using the "plm" package in R to analyse panel data. One of the important tests in this package for choosing between a "fixed effect" and a "random effect" model is the Hausman-type test. A similar test is also available in Stata. The point here is that Stata requires the fixed-effect model to be estimated first, followed by the random-effect model. However, I didn't see any such restriction in the "plm" package. So, I was wondering whether the "plm" package expects "fixed effect" first and then "random effect" second by default. For your reference, I mention below the steps in Stata and R that I

How to get the difference in value between subsequent observations (country-years)?

Let's say I have scores for 5 countries over a period of 10 years, such as:

    mydata<-1:3
    mydata<-expand.grid(
      country=c('A', 'B', 'C', 'D', 'E'),
      year=c('1980','1981','1982','1983','1984','1985','1986','1987','1988','1989'))
    mydata$score=sapply(runif(50,0,2), function(x) {round(x,4)})
    library(reshape)
    mydata<-reshape(mydata, v.names="score", idvar="year", timevar="country", direction="wide")

    > head(mydata)
       year score.A score.B score.C score.D score.E
    1  1980  1.0538  1.6921  1.3165  1.7434  1.9687
    6  1981  1.4773  1.6479  0.3135  0.6172  0.7704
    11 1982  0.8748  1.3704  0.2788  1.6306  1.7237
    16 1983  1.1224  1
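As a rough sketch of what the title asks for, the year-over-year change in each country's score can be computed column by column on the wide layout shown above. The snippet reuses the first three rows of `score.A` and `score.B` from the printed output. The object names `wide` and `changes` are made up for illustration, and the sketch assumes rows are sorted by year with no missing years.

```r
# First three rows of the wide table from the question (two countries only).
wide <- data.frame(
  year    = c(1980, 1981, 1982),
  score.A = c(1.0538, 1.4773, 0.8748),
  score.B = c(1.6921, 1.6479, 1.3704)
)

# Pick the score columns and replace each with its first difference.
# The first year has no predecessor, so its change is NA.
score_cols <- grep("^score\\.", names(wide), value = TRUE)
changes <- wide
changes[score_cols] <- lapply(wide[score_cols], function(v) c(NA, diff(v)))
print(changes)
```

If the data were kept in the long country-year layout instead, the same `c(NA, diff(v))` step could be applied within each country, for example via `ave()`.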

Generating a lagged time series cross sectional variable in R

Question: I am a new R user. I have a time series cross sectional dataset and, although I have found ways to lag time series data in R, I have not found a way to create lagged time-series cross sectional variables so that I can use them in my analysis.

Answer 1: Here's how you could use the lag() function with zoo (and panel series data):

    > library(plm)
    > library(zoo)
    > data("Produc")
    > dnow <- pdata.frame(Produc)
    > x.Date <- as.Date(paste(rownames(t(as.matrix(dnow$pcap))), "-01-01", sep=""))
    > x <- zoo(t(as.matrix(dnow$pcap)), x.Date)
    > x[1:3,1:3]
                ALABAMA  ARIZONA ARKANSAS
    1970-01-01 15032.67 10148.42  7613.26