Question
I have time-series data covering roughly the last 20 years. The variable was measured once a year, so I have 22 values. The data are in a tab-delimited file whose first column is the year and whose second column is the value. Here is what it looks like:
1991	438
1992	408
1993	381
1994	361
1995	338
1996	315
1997	289
1998	261
1999	229
2000	206
2001	190
2002	173
2003	151
2004	141
2005	126
2006	108
2007	99
2008	93
2009	85
2010	77
2011	71
2012	67
I want to extrapolate the values in the second column for the coming years. The rate at which those values decrease is itself slowing down, so I don't think linear regression is appropriate. I would like to know in which year the second column will approach zero. I have never used R, so it would be great if you could also help me with the code needed to read the data from a tab-delimited file.
Thanks
Answer 1:
The following is a sketch that may help you get started.
## get the data
tmp <- read.table(text="1991 438
1992 408
1993 381
1994 361
1995 338
1996 315
1997 289
1998 261
1999 229
2000 206
2001 190
2002 173
2003 151
2004 141
2005 126
2006 108
2007 99
2008 93
2009 85
2010 77
2011 71
2012 67", col.names=c("Year", "value"))
library(ggplot2)
## develop a model
tmp$pred1 <- predict(lm(value ~ poly(Year, 2), data=tmp))
## look at the data
p1 <- ggplot(tmp, aes(x = Year, y=value)) +
  geom_line() +
  geom_point() +
  geom_hline(aes(yintercept=0))
print(p1)
## check the model
p1 +
  geom_line(aes(y = pred1), color="red")
## extrapolate based on model
pred <- data.frame(Year=1990:2050)
pred$value <- predict(lm(value ~ poly(Year, 2), data=tmp),newdata=pred)
p1 +
  geom_line(color="red", data=pred)
In this case our model says the line will never cross zero: the fitted quadratic bottoms out above zero and then turns back up. If that makes no sense for your data, you'll want to pick a different model. Whatever model you pick, graph the result along with the data so you can see how well you're doing.
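As one possible "different model", here is a minimal sketch that assumes exponential decay, i.e. that log(value) is roughly linear in Year. That assumption is mine, not something the data guarantee, and the names fit_exp, pred_exp and threshold are made up for illustration. An exponential curve only approaches zero and never reaches it, so the sketch asks when the prediction falls below a chosen cut-off instead.

```r
## Same data as above, rebuilt here so the sketch runs on its own
tmp <- data.frame(
  Year  = 1991:2012,
  value = c(438, 408, 381, 361, 338, 315, 289, 261, 229, 206, 190,
            173, 151, 141, 126, 108, 99, 93, 85, 77, 71, 67)
)

## ASSUMPTION: exponential decay, so fit a straight line to log(value)
fit_exp <- lm(log(value) ~ Year, data = tmp)

## Predict on the log scale, then back-transform to the original scale
pred_exp <- data.frame(Year = 1990:2050)
pred_exp$value <- exp(predict(fit_exp, newdata = pred_exp))

## The curve never hits zero, so look for the first year below a cut-off
threshold <- 10  # hypothetical cut-off; pick one that is meaningful for your data
year_below <- min(pred_exp$Year[pred_exp$value < threshold])
year_below
```

Because pred_exp has the same Year and value columns as the data, it can be overlaid on the earlier plot with p1 + geom_line(color="blue", data=pred_exp) to check the fit by eye.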
Answer 2:
To read the data from a tab-delimited file:
require(utils) # 'utils' ships with R and is attached by default, so nothing needs installing
data <- read.table('<filename>', header=FALSE, sep='\t', col.names=c('Year','Value'))
See the read.table help page (?read.table) for the other options.
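If you want to try this without touching your real file, a small self-contained sketch follows. It writes three of the rows to a temporary tab-delimited file and reads them back; the temporary path and the name dat are stand-ins for illustration only.

```r
## Write a tiny tab-delimited file to a temporary location
path <- tempfile(fileext = ".txt")  # hypothetical path; use your own file name
writeLines(c("1991\t438", "1992\t408", "1993\t381"), path)

## sep = "\t" splits on tabs; header = FALSE because there is no header row
dat <- read.table(path, header = FALSE, sep = "\t",
                  col.names = c("Year", "Value"))
str(dat)
```

read.delim() is a wrapper around read.table() with sep = "\t" already set, but it defaults to header = TRUE, so you would still need header = FALSE for a file like yours.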
To extrapolate the data:
As EDi and Dirk said, you need to do a little reading. Decide what sort of extrapolation function you want: linear (Hmisc::approxExtrap does linear extrapolation; approxfun does interpolation but not extrapolation), spline (stats::splinefun or the splines package), etc. splinefun is probably OK for your case.
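A rough sketch of the splinefun route, using the data from the question. It assumes a natural spline (method = "natural"), which outside the data range continues as a straight line with the slope at the nearest end point, and it assumes the zero crossing lies somewhere between 2012 and 2100. Treat the result as a property of that straight-line continuation, not as a forecast with any stated uncertainty.

```r
years  <- 1991:2012
values <- c(438, 408, 381, 361, 338, 315, 289, 261, 229, 206, 190,
            173, 151, 141, 126, 108, 99, 93, 85, 77, 71, 67)

## Natural cubic spline through the points; beyond 2012 it is extended
## linearly using the slope of the curve at the last data point
f <- splinefun(years, values, method = "natural")

## Extrapolated values for the next few years
f(2013:2020)

## Year at which the extrapolated curve reaches zero
## ASSUMPTION: the crossing lies between 2012 and 2100
zero_year <- uniroot(f, interval = c(2012, 2100))$root
zero_year
```

The default method = "fmm" also returns values outside the data range, but the splinefun help page warns that extrapolation makes little sense for it, which is why the natural spline is used here.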
Specifically for forecasting time series, see the forecast package (you should also browse related SO questions).
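A minimal sketch of what that might look like, assuming the forecast package is installed (install.packages("forecast")). The choice of auto.arima and a 10-year horizon is mine, purely for illustration; the block is guarded so it does nothing if the package is missing.

```r
## Annual series starting in 1991
y <- ts(c(438, 408, 381, 361, 338, 315, 289, 261, 229, 206, 190,
          173, 151, 141, 126, 108, 99, 93, 85, 77, 71, 67),
        start = 1991, frequency = 1)

## ASSUMPTION: the 'forecast' package is available
if (requireNamespace("forecast", quietly = TRUE)) {
  fit <- forecast::auto.arima(y)          # automatic ARIMA order selection
  fc  <- forecast::forecast(fit, h = 10)  # forecasts for the next 10 years
  print(fc)                               # point forecasts with prediction intervals
}
```

Note that nothing in a plain ARIMA fit keeps the forecasts or their intervals above zero, so check the output against what is physically sensible for your variable.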
After you skim those help pages, try something out, post some code, and tell us where you're stuck; then we can respond in more detail. Otherwise you'll get flamed mercilessly and your question will likely be closed as 'Give me teh codez' ;-)
Source: https://stackoverflow.com/questions/15535877/extrapolate-in-r-for-a-time-series-data