Convert from annual to quarterly data, constrained to annual average

I have several variables at annual frequency in R that I would like to include in a regression analysis with other variables available at quarterly frequency. Additionally,

3 answers

佛祖请我去吃肉

2021-01-06 08:44
A bit late here, but the tempdisagg package does what you want. It ensures that either the sum, the average, the first or the last value of the resulting high frequency series is consistent with the low frequency series.

It also allows you to use external indicator series, e.g., by the Chow-Lin technique. If you don't have an indicator series, the Denton-Cholette method produces a better result than the method in EViews.

Here's your example:
```
# need ts object as input
z_a <- ts(c(100, 110, 111), start = 2000)

library(tempdisagg)
z_q <- predict(td(z_a ~ 1, method = "denton-cholette", conversion = "average"))

z_q
#           Qtr1      Qtr2      Qtr3      Qtr4
# 2000  97.65795  98.59477 100.46841 103.27887
# 2001 107.02614 109.71460 111.34423 111.91503
# 2002 111.42702 111.06100 110.81699 110.69499

# which has the same means as your original series:

tapply(z_q, floor(time(z_q)), mean)
# 2000 2001 2002 
#  100  110  111 
```
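For flow variables the same call can be constrained to the annual sum instead of the average. A minimal sketch, assuming the same z_a series as above and treating the annual figures as totals (an assumption made only for illustration); the check at the end uses base R's aggregate:
```
library(tempdisagg)

# same annual series as above, here treated as annual totals (assumption)
z_a <- ts(c(100, 110, 111), start = 2000)

# conversion = "sum" makes the four quarters of each year add up to the annual value
z_q_sum <- predict(td(z_a ~ 1, method = "denton-cholette", conversion = "sum"))

# aggregate back to annual frequency to verify the constraint
aggregate(z_q_sum, nfrequency = 1, FUN = sum)
```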
梦毁少年i

2021-01-06 08:46
We could manipulate the output of na.spline to ensure that it averages to the annual values, either by shifting all 4 quarters' values or by shifting only the last 3 quarters' values. In the first case we subtract the mean of the 4 quarters from each quarter and then add the annual value to each quarter. In the second case we subtract the mean of the last 3 quarters from each of the last 3 quarters and add the annual value. Because na.spline passes through the observed points, the first-quarter value of z_q equals the annual value, which is why the code below uses x[1] for it.

In each case, averaging the z_q_adj values over the four quarters of a year recovers the original annual value.

Here are the two approaches mentioned:
```
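# 1: shift all four quarters so that their mean equals the annual value
# (x[1] is the first-quarter value, which equals z_a for that year)
yr <- format(time(c), "%Y")
c$z_q_adj <- ave(coredata(c$z_q), yr, FUN = function(x) x - mean(x) + x[1])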
```
giving:
```
> c
           z_a      z_q   z_q_adj
2000-01-01 100 100.0000  95.36604
2000-04-01  NA 103.4434  98.80946
2000-07-01  NA 106.4080 101.77405
2000-10-01  NA 108.6844 104.05046
2001-01-01 110 110.0000 109.39295
2001-04-01  NA 110.5723 109.96527
2001-07-01  NA 110.8719 110.26484
2001-10-01  NA 110.9840 110.37694
2002-01-01 111 111.0000 110.86116
2002-04-01  NA 111.0150 110.87615
2002-07-01  NA 111.1219 110.98311
2002-10-01  NA 111.4184 111.27958


# 2: keep the first quarter and shift only the last three quarters
c$z_q_adj <- ave(coredata(c$z_q), yr, FUN = function(x) c(x[1], x[-1] - mean(x[-1]) + x[1]))
```
giving:
```
> c
           z_a      z_q  z_q_adj
2000-01-01 100 100.0000 100.0000
2000-04-01  NA 103.4434  97.2648
2000-07-01  NA 106.4080 100.2294
2000-10-01  NA 108.6844 102.5058
2001-01-01 110 110.0000 110.0000
2001-04-01  NA 110.5723 109.7629
2001-07-01  NA 110.8719 110.0625
2001-10-01  NA 110.9840 110.1746
2002-01-01 111 111.0000 111.0000
2002-04-01  NA 111.0150 110.8299
2002-07-01  NA 111.1219 110.9368
2002-10-01  NA 111.4184 111.2333
```
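The claim that both adjustments recover the annual values can be checked in base R. A small sketch, with the z_q values typed in from the output above rather than recomputed with na.spline, and a plain year vector standing in for the zoo index:
```
# z_q column copied from the printed output above
z_q <- c(100.0000, 103.4434, 106.4080, 108.6844,
         110.0000, 110.5723, 110.8719, 110.9840,
         111.0000, 111.0150, 111.1219, 111.4184)
yr <- rep(2000:2002, each = 4)

# approach 1: shift all four quarters
adj1 <- ave(z_q, yr, FUN = function(x) x - mean(x) + x[1])

# approach 2: keep Q1, shift Q2 to Q4
adj2 <- ave(z_q, yr, FUN = function(x) c(x[1], x[-1] - mean(x[-1]) + x[1]))

# both give annual means of 100, 110, 111 (up to floating-point rounding)
tapply(adj1, yr, mean)
tapply(adj2, yr, mean)
```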
ADDED: If you want to record whether a series was interpolated or not, some approaches are:
- add a comment to the series, e.g. comment(c) <- "Originally annual", or
- use a naming convention, e.g. add _a to the series name if it was originally annual: c_a <- c, or
- if it's OK to retain both the z_q and z_q_adj columns, then for series that originated from quarterly data the two columns should be the same and otherwise not, or
- keep a column for both the original data and the quarterly data
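The first two options might look like this in base R. A minimal sketch: the quarterly values are made up, and was_annual is a hypothetical helper written for this example, not part of any package:
```
# made-up quarterly series standing in for an interpolated one
z_q <- ts(c(98, 99, 101, 102), start = c(2000, 1), frequency = 4)

# option 1: attach a comment (stored as an attribute, not shown when printing)
comment(z_q) <- "Originally annual"

# hypothetical helper: TRUE only if the series carries that exact comment
was_annual <- function(x) identical(comment(x), "Originally annual")

was_annual(z_q)                 # TRUE
was_annual(ts(1:4))             # FALSE, no comment set

# option 2: naming convention, suffix _a marks an originally annual series
z_q_a <- z_q
```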
小鲜肉

2021-01-06 09:04
Perhaps I'm missing something here, but assuming the annual value always comes from the first quarter, couldn't you just replace mean in your aggregate call with min?
```
 > d <- aggregate(c, as.integer(format(index(c),"%Y")), min, na.rm=TRUE)
 > d
      z_a z_q
 2000 100 100
 2001 110 110
 2002 111 111
```
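Note that min returns the first-quarter value only when the series does not fall below it within the year, as in the example above. If that cannot be assumed, picking the first element of each year directly might be safer. A small base R sketch with made-up numbers in which the series dips during 2000:
```
# hypothetical quarterly values; 2000 dips below its first quarter
z_q <- c(100, 98, 97, 99, 110, 111, 112, 113)
yr <- rep(2000:2001, each = 4)

tapply(z_q, yr, min)                 # 97 110: misses the Q1 value in 2000
tapply(z_q, yr, function(x) x[1])    # 100 110: always the Q1 value
```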