Interpolation in R: retrieving hourly values

Posted by 女生的网名这么多〃 on 2021-01-28 04:30:11

Question


I recognize there are several related questions, but I seem to be stumbling somewhere. I followed the thread "Interpolating timeseries" as best I could, but I get an error message (see below).

My dataset contains samples collected four times a day (every six hours in the subsample below). I would like to interpolate these data to hourly values. Below is a subsample of my much larger dataset:

vis <- structure(list(datetime = structure(1:24, .Label = c("2002-05-01-00", 
"2002-05-01-06", "2002-05-01-12", "2002-05-01-18", "2002-05-02-00", 
"2002-05-02-06", "2002-05-02-12", "2002-05-02-18", "2002-05-03-00", 
"2002-05-03-06", "2002-05-03-12", "2002-05-03-18", "2002-05-04-00", 
"2002-05-04-06", "2002-05-04-12", "2002-05-04-18", "2002-05-05-00", 
"2002-05-05-06", "2002-05-05-12", "2002-05-05-18", "2002-05-06-00", 
"2002-05-06-06", "2002-05-06-12", "2002-05-06-18"), class = "factor"), 
    VIStot = c(0L, 128L, 359L, 160L, 1L, 121L, 316L, 162L, 1L, 
    132L, 339L, 163L, 2L, 137L, 364L, 155L, 3L, 122L, 345L, 179L, 
    3L, 125L, 147L, 77L)), .Names = c("datetime", "VIStot"), class = "data.frame", row.names = c(NA, 
-24L))

My code to interpolate to hourly resolution is as follows:

vis[, c(2)] <- sapply(vis[, c(2)], as.numeric)
library(zoo)
vis$datetime <- as.POSIXct(vis$datetime, format="%Y-%m-%d-%H")
hr <- zoo(vis$VIStot, vis$datetime)

int <- na.spline(hr$VIStot)

This ends with the error message

Error in $.zoo(hr, VIStot) : not possible for univariate zoo series

Am I not formatting the datetime correctly? Why is hr not reading both VIStot and datetime?

Also, once the values are interpolated, I would like to export them to a .csv file.


Answer 1:


Two thoughts on this. First, na.spline fills in NA values, and vis$VIStot contains none. So your first issue is probably that the series is not defined on the hourly sequence you want the function to fill.
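A quick way to see this point, sketched on a hypothetical four-value toy series (the name toy and the UTC time zone are assumptions, not part of the question's data): with no NA values present, na.spline has nothing to fill and hands back the same observations.

```r
library(zoo)

# Hypothetical toy series: four 6-hourly values with no gaps
idx <- as.POSIXct(c("2002-05-01 00:00", "2002-05-01 06:00",
                    "2002-05-01 12:00", "2002-05-01 18:00"), tz = "UTC")
toy <- zoo(c(0, 128, 359, 160), idx)

# There are no NAs to fill, so no hourly points appear
filled <- na.spline(toy)
n_filled <- length(filled)
same_values <- isTRUE(all.equal(coredata(filled), coredata(toy)))
```

Here n_filled stays at 4 and same_values should be TRUE, which is why the hourly time points have to be introduced explicitly before the spline can produce anything new.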

Second, if you are looking for simple interpolation, then how about:

## using your "vis" above
newdt <- seq.POSIXt(vis$datetime[1], tail(vis$datetime, n=1), by='1 hour')
data.frame(datetime=newdt, VIStot=approx(vis$datetime, vis$VIStot, newdt)$y)
##                datetime    VIStot
## 1   2002-05-01 00:00:00   0.00000
## 2   2002-05-01 01:00:00  21.33333
## 3   2002-05-01 02:00:00  42.66667
## 4   2002-05-01 03:00:00  64.00000
## 5   2002-05-01 04:00:00  85.33333
## 6   2002-05-01 05:00:00 106.66667

I recognize this is somewhat of a workaround, but you can convert the result into your zoo object easily from here.
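As a rough sketch of that conversion, assuming a toy stand-in for vis (the names idx, vals, lin and hr_lin are illustrative only), the linearly interpolated values can be wrapped in a zoo series indexed by the hourly sequence:

```r
library(zoo)

# Toy stand-in for the first day of `vis` (time zone is an assumption)
idx <- as.POSIXct(c("2002-05-01 00:00", "2002-05-01 06:00",
                    "2002-05-01 12:00", "2002-05-01 18:00"), tz = "UTC")
vals <- c(0, 128, 359, 160)

# Hourly grid spanning the observed range
newdt <- seq(idx[1], idx[length(idx)], by = "1 hour")

# Linear interpolation onto the hourly grid
lin <- approx(idx, vals, xout = newdt)$y

# Wrap the interpolated values in a zoo series indexed by time
hr_lin <- zoo(lin, newdt)
```

The resulting hr_lin is a regular hourly series that other zoo functions can work with directly.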

Another way I got it to work:

library(zoo)
## right join onto the hourly sequence: hours without a sample get NA
vis2 <- merge(vis, data.frame(datetime=newdt), by='datetime', all.y=TRUE)
head(vis2, n=8)
##              datetime VIStot
## 1 2002-05-01 00:00:00      0
## 2 2002-05-01 01:00:00     NA
## 3 2002-05-01 02:00:00     NA
## 4 2002-05-01 03:00:00     NA
## 5 2002-05-01 04:00:00     NA
## 6 2002-05-01 05:00:00     NA
## 7 2002-05-01 06:00:00    128
## 8 2002-05-01 07:00:00     NA

hr2 <- zoo(vis2$VIStot, vis2$datetime)
head(hr2, n=8)
## 2002-05-01 00:00:00 2002-05-01 01:00:00 2002-05-01 02:00:00 
##                   0                  NA                  NA 
## 2002-05-01 03:00:00 2002-05-01 04:00:00 2002-05-01 05:00:00 
##                  NA                  NA                  NA 
## 2002-05-01 06:00:00 2002-05-01 07:00:00 
##                 128                  NA 

And here is the result:

head(na.spline(hr2), n=8)
## 2002-05-01 00:00:00 2002-05-01 01:00:00 2002-05-01 02:00:00 
##            0.000000          -12.533246           -8.229736 
## 2002-05-01 03:00:00 2002-05-01 04:00:00 2002-05-01 05:00:00 
##           10.442935           41.017177           81.025396 
## 2002-05-01 06:00:00 2002-05-01 07:00:00 
##          128.000000          179.200449 

Whether you need linear interpolation, a spline, or something else, perhaps this will get you moving in the right direction.




Answer 2:


hr is a univariate zoo series, so it has no columns and hr$VIStot raises an error. Use hr itself.
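To illustrate the difference, here is a small sketch using hypothetical two-point series (the names uni and multi are made up for this example): a zoo object built from a plain vector has no dimensions, while one built from a matrix with column names supports `$`.

```r
library(zoo)

idx <- as.POSIXct(c("2002-05-01 00:00", "2002-05-01 06:00"), tz = "UTC")

# Built from a plain vector: univariate, no dimensions, no column names
uni <- zoo(c(0, 128), idx)
is.null(dim(uni))   # TRUE

# Built from a one-column matrix with a column name: `$` works here
multi <- zoo(cbind(VIStot = c(0, 128)), idx)
colnames(multi)     # "VIStot"
col <- multi$VIStot

# On the univariate series the same access fails, as in the question
res <- try(uni$VIStot, silent = TRUE)
inherits(res, "try-error")   # TRUE
```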

Try this. We create an hourly sequence, tt, and then evaluate splines based on hr at those values:

rng <- range(time(hr))
tt <- seq(rng[1], rng[2], by = "hour")
z <- na.spline(hr, xout = tt)

This gives the following:

> head(z)
2002-05-01 00:00:00 2002-05-01 01:00:00 2002-05-01 02:00:00 2002-05-01 03:00:00 
           0.000000          -12.533246           -8.229736           10.442935 
2002-05-01 04:00:00 2002-05-01 05:00:00 
          41.017177           81.025396 

and:

plot(z, type = "o")
points(hr, pch = 20, col = "red") # original points made red
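For the CSV export mentioned in the question, one possible approach is to flatten the interpolated series into a data frame and write it with base R's write.csv. This is only a sketch: hr_toy and z_toy are toy stand-ins for hr and z, and the file name and temporary directory are hypothetical.

```r
library(zoo)

# Toy 6-hourly series standing in for `hr` (time zone is an assumption)
idx <- as.POSIXct(c("2002-05-01 00:00", "2002-05-01 06:00",
                    "2002-05-01 12:00", "2002-05-01 18:00"), tz = "UTC")
hr_toy <- zoo(c(0, 128, 359, 160), idx)

# Hourly spline interpolation, following the same steps as above
tt <- seq(start(hr_toy), end(hr_toy), by = "hour")
z_toy <- na.spline(hr_toy, xout = tt)

# Flatten to a data frame with an explicit text timestamp
out <- data.frame(datetime = format(index(z_toy), "%Y-%m-%d %H:%M:%S"),
                  VIStot = coredata(z_toy))

# Hypothetical output location; replace with a real path as needed
out_file <- file.path(tempdir(), "vis_hourly.csv")
write.csv(out, out_file, row.names = FALSE)
```

Formatting the index as text before writing keeps the timestamps in the file independent of the session's time zone settings.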



Source: https://stackoverflow.com/questions/32598443/interpolation-in-r-retrieving-hourly-values
