Calculating hourly averages from a multi-year timeseries

栀梦 2021-01-03 07:47

I have a dataset filled with the average windspeed per hour for multiple years. I would like to create an 'average year', in which for each hour the average windspeed for

3 answers
  • 2021-01-03 08:21

    This is a pretty old post, but I wanted to add that timeAverage from the openair package can probably be used as well. The openair manual describes further options for the timeAverage function.

  • 2021-01-03 08:27

    You can use substr to extract the part of the date you want, and then use tapply or ddply to aggregate the data.

    tapply(
      data.multipleyears$Windspeed, 
      substr( data.multipleyears$DATETIME, 6, 19), 
      mean 
    )
    # 01-01 01:00:00 02-29 12:00:00 05-03 09:00:00 
    #              9              3              5 
    
    library(plyr)
    ddply(
      data.multipleyears, 
      .(when=substr(DATETIME, 6, 19)), 
      summarize, 
      Windspeed=mean(Windspeed)
    )
    #             when Windspeed
    # 1 01-01 01:00:00         9
    # 2 02-29 12:00:00         3
    # 3 05-03 09:00:00         5
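
    The calls above assume that data.multipleyears already exists. As a minimal, self-contained sketch of the same substr plus tapply idea, the block below builds a small hypothetical data frame (the values are made up for illustration, and DATETIME is assumed to be a character column in "YYYY-MM-DD HH:MM:SS" form) and averages the readings that share the same month, day, and hour:

    ```r
    # Hypothetical sample data: three years of readings for the same two hours of the year.
    data.multipleyears <- data.frame(
      DATETIME = c("2009-01-01 01:00:00", "2010-01-01 01:00:00", "2011-01-01 01:00:00",
                   "2009-05-03 09:00:00", "2010-05-03 09:00:00", "2011-05-03 09:00:00"),
      Windspeed = c(8, 9, 10, 4, 5, 6),
      stringsAsFactors = FALSE
    )

    # Characters 6 to 19 of "YYYY-MM-DD HH:MM:SS" are "MM-DD HH:MM:SS",
    # i.e. the timestamp with the year removed.
    hour_of_year <- substr(data.multipleyears$DATETIME, 6, 19)

    # Average all readings that share the same month, day, and hour.
    average_year <- tapply(data.multipleyears$Windspeed, hour_of_year, mean)
    print(average_year)
    ```

    If DATETIME is stored as POSIXct rather than as text, building the key with format(DATETIME, "%m-%d %H:%M:%S") is a safer way to drop the year than relying on substr.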
    
  • 2021-01-03 08:33

    I predict that ddply and the plyr package are going to be your best friends :). I created a 30-year dataset with random hourly windspeeds between 1 and 10 m/s:

    begin_date = as.POSIXlt("1990-01-01", tz = "GMT")
    # 30 x 365 days of hourly timestamps (roughly 30 years; leap days are not accounted for)
    dat = data.frame(dt = begin_date + (0:(24*30*365 - 1)) * 3600)
    dat = within(dat, {
      speed = runif(length(dt), 1, 10)
      # format() with an explicit tz; strftime() defaults to the local time zone
      unique_day = format(dt, "%d-%m", tz = "GMT")
    })
    > head(dat)
                       dt unique_day    speed
    1 1990-01-01 00:00:00      01-01 7.054124
    2 1990-01-01 01:00:00      01-01 2.202591
    3 1990-01-01 02:00:00      01-01 4.111633
    4 1990-01-01 03:00:00      01-01 2.687808
    5 1990-01-01 04:00:00      01-01 8.643168
    6 1990-01-01 05:00:00      01-01 5.499421
    

    To calculate the daily normals (30-year averages; the term is widely used in meteorology) over this 30-year period:

    library(plyr)
    res = ddply(dat, .(unique_day), 
                summarise, mean_speed = mean(speed), .progress = "text")
    > head(res)
      unique_day mean_speed
    1      01-01   5.314061
    2      01-02   5.677753
    3      01-03   5.395054
    4      01-04   5.236488
    5      01-05   5.436896
    6      01-06   5.544966
    

    This takes just a few seconds on my humble two-core AMD, so I suspect there is no need to compute everything in a single pass through the data. Several of these ddply calls for different aggregations (month, season, etc.) can be run separately.
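
    As a rough sketch of running separate aggregations, the block below uses base R's aggregate instead of ddply so that it needs no extra packages. The column names hour_of_year and month, the two-year span, and the seed are assumptions made for this illustration only. The hour_of_year key ("%m-%d %H") gives one mean per hour of the year, which is the 'average year' the question asks about:

    ```r
    set.seed(1)  # arbitrary seed so the random speeds are reproducible

    # Two years of hourly timestamps in GMT (shorter than the 30 years above, to keep it quick).
    dt <- seq(as.POSIXct("1990-01-01 00:00:00", tz = "GMT"),
              as.POSIXct("1991-12-31 23:00:00", tz = "GMT"),
              by = "hour")
    dat <- data.frame(dt = dt, speed = runif(length(dt), 1, 10))

    # One grouping key per aggregation level.
    dat$hour_of_year <- format(dat$dt, "%m-%d %H", tz = "GMT")  # e.g. "01-01 00"
    dat$month        <- format(dat$dt, "%m", tz = "GMT")        # e.g. "01"

    # Each aggregate() call is an independent pass over the data.
    hourly_normals  <- aggregate(speed ~ hour_of_year, data = dat, FUN = mean)
    monthly_normals <- aggregate(speed ~ month, data = dat, FUN = mean)

    head(hourly_normals)
    ```

    With real multi-year data, 29 February only occurs in leap years, so the means for those hours rest on fewer observations than the rest of the year.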
