Calculating hourly averages from a multi-year timeseries

前端未结


I have a dataset filled with the average windspeed per hour for multiple years. I would like to create an 'average year', in which for each hour the average windspeed for


3 Answers

逝去的感伤

2021-01-03 08:21

This is a pretty old post, but I wanted to add that timeAverage in the openair package can probably also be used. The manual describes more options for the timeAverage function.


-上瘾入骨i

2021-01-03 08:27

You can use substr to extract the part of the date you want, and then use tapply or ddply to aggregate the data.

tapply(
  data.multipleyears$Windspeed, 
  substr( data.multipleyears$DATETIME, 6, 19), 
  mean 
)
# 01-01 01:00:00 02-29 12:00:00 05-03 09:00:00 
#              9              3              5 

library(plyr)
ddply(
  data.multipleyears, 
  .(when=substr(DATETIME, 6, 19)), 
  summarize, 
  Windspeed=mean(Windspeed)
)
#             when Windspeed
# 1 01-01 01:00:00         9
# 2 02-29 12:00:00         3
# 3 05-03 09:00:00         5
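
The calls above assume a data frame that is not shown in this answer. Below is a minimal self-contained sketch of the same idea: the column names follow the answer, but the sample values are made up for illustration, and base R's aggregate stands in for tapply/ddply.

```r
# Hypothetical sample data: column names as in the answer, values invented
data.multipleyears <- data.frame(
  DATETIME = c("2010-01-01 01:00:00", "2011-01-01 01:00:00",
               "2012-02-29 12:00:00",
               "2010-05-03 09:00:00", "2011-05-03 09:00:00"),
  Windspeed = c(8, 10, 3, 4, 6),
  stringsAsFactors = FALSE
)

# Characters 6 to 19 of "YYYY-MM-DD HH:MM:SS" are "MM-DD HH:MM:SS",
# i.e. the timestamp with the year dropped
data.multipleyears$when <- substr(data.multipleyears$DATETIME, 6, 19)

# Average over all years that share the same month, day and hour
average_year <- aggregate(Windspeed ~ when, data = data.multipleyears, FUN = mean)
print(average_year)
```

If DATETIME is stored as a date-time class rather than as text, building the key with format(DATETIME, "%m-%d %H:%M:%S") is a more explicit alternative to substr.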


[愿得一人]

2021-01-03 08:33

I predict that ddply and the plyr package are going to be your best friend :). I created a 30-year dataset with hourly random windspeeds between 1 and 10 m/s:

begin_date = as.POSIXlt("1990-01-01", tz = "GMT")
# 30 year dataset
dat = data.frame(dt = begin_date + (0:(24*30*365)) * (3600))
dat = within(dat, {
  speed = runif(length(dt), 1, 10)
  unique_day = strftime(dt, "%d-%m")
})
> head(dat)
                   dt unique_day    speed
1 1990-01-01 00:00:00      01-01 7.054124
2 1990-01-01 01:00:00      01-01 2.202591
3 1990-01-01 02:00:00      01-01 4.111633
4 1990-01-01 03:00:00      01-01 2.687808
5 1990-01-01 04:00:00      01-01 8.643168
6 1990-01-01 05:00:00      01-01 5.499421

To calculate the daily normals (30-year averages; the term is widely used in meteorology) over this 30-year period:

library(plyr)
res = ddply(dat, .(unique_day), 
            summarise, mean_speed = mean(speed), .progress = "text")
> head(res)
  unique_day mean_speed
1      01-01   5.314061
2      01-02   5.677753
3      01-03   5.395054
4      01-04   5.236488
5      01-05   5.436896
6      01-06   5.544966

This takes just a few seconds on my humble two-core AMD, so I suspect that computing everything in a single pass through the data is not needed. Several of these ddply calls for different aggregations (month, season, etc.) can be run separately.
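
The same grouping pattern extends from daily to hourly normals by putting the hour into the grouping key. This is a rough sketch under a few assumptions: it uses a shorter three-year series to keep it quick, the column name hour_of_year and the seed are illustrative choices, and base R's aggregate stands in for ddply.

```r
set.seed(42)  # arbitrary seed, only so that repeated runs match

# Three years of hourly timestamps (shorter than the 30 years above, for speed)
begin_date <- as.POSIXct("1990-01-01", tz = "GMT")
n_hours <- 24 * 365 * 3
dat <- data.frame(dt = begin_date + (0:(n_hours - 1)) * 3600)
dat$speed <- runif(nrow(dat), 1, 10)

# Grouping key: month, day and hour, with the year dropped
dat$hour_of_year <- format(dat$dt, "%m-%d %H")

# One mean per calendar hour across all years
hourly_normals <- aggregate(speed ~ hour_of_year, data = dat, FUN = mean)

# Number of observations behind each mean
n_obs <- table(dat$hour_of_year)

head(hourly_normals)
```

Keys for 29 February only occur in leap years, so those means rest on fewer observations than the rest; n_obs makes that visible.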
