R - predicting simple dyn model with one lag term

Question

I'm trying to predict a simple lagged time series regression with the dyn library in R. This question was a helpful starting point, but I'm getting some weird behaviour that I'm hoping someone can explain.

Here's a minimum working example.

library(dyn)

# Initial data
y.orig <- arima.sim(model=list(ar=c(.9)),n=10)
x1.orig <- rnorm(10)
data <- cbind(y=y.orig, x1=x1.orig)

# This model, with a single lag term, predicts from t=2
mod1 <- dyn$lm(y ~ lag(y, -1), data)
y.new <- window(y.orig, end=end(y.orig) + c(5,0), extend=TRUE)
newdata1 <- cbind(y=y.new)
predict(mod1, newdata1)

# This one, with a lag plus another predictor, predicts from t=1 on
mod2 <- dyn$lm(y ~ lag(y, -1) + x1, data)
y.new <- window(y.orig, end=end(y.orig) + c(5,0), extend=TRUE)
x1.new <- c(x1.orig, rnorm(5))
newdata2 <- cbind(y=y.new, x1=x1.new)
predict(mod2, newdata2)

Why is there a difference between the two? Can anyone suggest how to predict from mod1 using dyn? Thanks in advance.

Answer 1:

Both mod1 and mod2 start predicting at t=2. The prediction vector for mod2 starts at t=1, but its first value is NA. As for why one starts at t=2 and the other at t=1: predict merges the variables on the right-hand side of the formula. For mod1, lag(y, -1) starts at t=2 because y starts at t=1. For mod2, merging lag(y, -1) with x1 gives a series that starts at t=1 (since x1 starts at t=1). Try this, which does not involve dyn:

> start(with(as.list(newdata1), merge.zoo(lag(y, -1))))
[1] 2
> start(with(as.list(newdata2), merge.zoo(lag(y, -1), x1)))
[1] 1
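The same alignment can be seen with plain ts objects. This is only a rough base-R sketch: the series y and x1 below are made-up values, and cbind on ts objects stands in for the merge that predict performs.

```r
# Base-R sketch; y and x1 here are made-up example series
y  <- ts(c(5, 6, 7, 8))          # observed at t = 1..4
x1 <- ts(c(10, 20, 30, 40))      # observed at t = 1..4

ylag <- stats::lag(y, -1)        # same values, time base shifted to t = 2..5
tsp(ylag)                        # start 2, end 5, frequency 1

# cbind on ts objects takes the union of the time ranges and pads with NA
merged <- cbind(ylag = ylag, x1 = x1)
tsp(merged)                      # start 1, end 5, frequency 1
merged                           # row t = 1 has ylag = NA; row t = 5 has x1 = NA
```

The lagged series alone starts at t=2, but once it is combined with a series that starts at t=1, the result starts at t=1 with an NA in the lag column.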

If we wanted predict(mod1, newdata1) to start at t=1, we could add our own Intercept column and remove the default intercept to avoid duplication. That forces the result to start at t=1, since the right-hand side now contains a series that starts at t=1:

data.b <- cbind(y=y.orig, x1=x1.orig, Intercept = 1)
mod.b <- dyn$lm(y ~ Intercept + lag(y, -1) - 1, data.b)

newdata.b <- cbind(Intercept = 1, y = y.new)
predict(mod.b, newdata.b)
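To check that swapping the default intercept for an explicit column of ones does not change the fit, here is a small sketch with ordinary lm. The data frame d and its columns are invented for illustration and are not part of the question's data.

```r
set.seed(42)                          # arbitrary seed, for reproducibility only
d <- data.frame(x = rnorm(20))
d$y <- 1 + 2 * d$x + rnorm(20)
d$Intercept <- 1                      # explicit column of ones

fit.default  <- lm(y ~ x, data = d)                  # implicit intercept
fit.explicit <- lm(y ~ Intercept + x - 1, data = d)  # our own intercept column

# Same estimates; only the name of the first coefficient differs
coef(fit.default)
coef(fit.explicit)
all.equal(unname(coef(fit.default)), unname(coef(fit.explicit)))
```

So the Intercept column only changes which series take part in the alignment, not the regression itself.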

Regarding the second question: if you want mod1's in-sample predictions, use fitted(mod1).

There seems to be a third question lurking here about how it all works, so maybe this clarifies it. All dyn does is align the time series in the formula; after that, lm and predict run as usual. For example, if we create an aligned model frame using dyn$model.frame, everything else can be done with ordinary lm and ordinary predict, and dyn is not involved from that point onwards. Below, mod1a is similar to mod1 from the question, except that it runs an ordinary lm on the aligned model frame. If you understand the mod1a lm and its predict, then mod1 and its predict work similarly.

## mod1 and mod1a are similar

# from code in the question
mod1 <- dyn$lm(y ~ lag(y, -1), data = data)
mod1

# redo it using a plain lm by applying dyn to model.frame
mf <- dyn$model.frame(y ~ lag(y, -1), data = data)
mod1a <- lm(y ~ `lag(y, -1)`, mf)
mod1a

## the two predicts below are similar

# the 1 ensures it's an mts rather than a ts, but is otherwise not used
newdata1 <- cbind(y=y.new, 1)
predict(mod1, newdata1)

newdata1a <- cbind(1, `lag(y, -1)` = lag(y.new, -1))
predict(mod1a, newdata1a)
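For comparison, the "align first, then use plain lm" idea can also be sketched without dyn at all. This is an illustration under assumptions, not how dyn is implemented: ts.intersect does the alignment, and the column name ylag1 and the seed are made up for the example.

```r
set.seed(1)                               # arbitrary seed, for reproducibility only
y <- arima.sim(model = list(ar = 0.9), n = 10)

# Align y with its own first lag; ts.intersect keeps only the overlap, t = 2..10
aligned <- ts.intersect(y = y, ylag1 = stats::lag(y, -1), dframe = TRUE)

# Ordinary lm on the aligned data frame
fit <- lm(y ~ ylag1, data = aligned)

# One-step-ahead forecast for t = 11: the lag term is the last observed y
newdata <- data.frame(ylag1 = as.numeric(y[length(y)]))
pred <- predict(fit, newdata)
pred
```

Once the lagged column exists as an ordinary variable, predict only needs new values for that column, which is why the forecast for t=11 is built from y at t=10.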

Source: https://stackoverflow.com/questions/11215868/r-predicting-simple-dyn-model-with-one-lag-term

Tags

dynamic

regression

forecasting