Aggregating minutes to hour demand

Posted by 99封情书 on 2019-12-11 01:43:51

Question


I am not sure this is the right section for this question. I looked around and did not find an answer, so here it is:

I have a CSV file ordered as follows:

dat <- read.csv(text="Date,Demand
01/01/2012 00:00:00,5061.5
01/01/2012 00:05:00,5030.0
01/01/2012 00:10:00,5011.5
01/01/2012 00:15:00,4983.5
01/01/2012 00:20:00,4963.4
01/01/2012 00:25:00,4980.6
01/01/2012 00:30:00,4969.4
01/01/2012 00:35:00,4961.7
01/01/2012 00:40:00,4929.0
01/01/2012 00:45:00,4907.1
01/01/2012 00:50:00,4892.8
01/01/2012 00:55:00,4870.1
01/01/2012 01:00:00,4860.4",header=TRUE)

The date format is, I guess, %m/%d/%Y %H:%M:%S.

I'd like to aggregate the demand by hour, as follows:

01/01/2012 00:00:00.................59560.6 MGW/h
# which is the sum of the first 12 rows
01/01/2012 01:00:00.................xxxxxxx MGW/h
01/01/2012 02:00:00.................xxxxxxx MGW/h

Of course, my real file is much larger than that: more than 1 million lines in total.

I hope that is clear enough. There may also be a date format problem; if so, does anyone know how to convert it to the right format? I tried as.Date, but the result was not what I expected.


Answer 1:


Using the example data, something like this could work:

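# Parse the timestamps, truncate each one to the start of its hour,
# then sum Demand within each hour
aggregate(
  list(Demand = dat$Demand),
  list(DateAgg =
    as.POSIXct(trunc(as.POSIXct(dat$Date, format = "%m/%d/%Y %H:%M:%S"), "hours"))
  ),
  FUN = sum
)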


#              DateAgg  Demand
#1 2012-01-01 00:00:00 59560.6
#2 2012-01-01 01:00:00  4860.4
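
The question mentions that as.Date did not give the expected result. A likely explanation, offered here as an assumption rather than something stated in the thread, is that as.Date keeps only the calendar day and drops the time of day, so all rows of one day fall into a single group. The sketch below contrasts the two parsers on three of the sample timestamps; the tz = "UTC" argument and the variable names are illustrative choices, not part of the original answer.

```r
stamps <- c("01/01/2012 00:00:00", "01/01/2012 00:55:00", "01/01/2012 01:00:00")

# as.Date() parses only the date part, so the time of day is lost
as_day <- as.Date(stamps, format = "%m/%d/%Y")

# as.POSIXct() keeps the time, which makes truncating to the hour possible
# (tz = "UTC" is an assumption made here to keep the result reproducible)
as_time <- as.POSIXct(stamps, format = "%m/%d/%Y %H:%M:%S", tz = "UTC")
as_hour <- as.POSIXct(trunc(as_time, units = "hours"))

length(unique(as_day))   # number of distinct days
length(unique(as_hour))  # number of distinct hours
```

Grouping by as_day would therefore produce daily totals, while grouping by as_hour produces the hourly totals asked for.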



Answer 2:


I recommend the xts package, which is well suited to time series analysis.

The following example shows how to compute sums over a given period:

library(xts)
# Convert the data to an xts object indexed by the parsed timestamps
dat.xts <- xts(dat$Demand, order.by = as.POSIXct(dat$Date, format = "%m/%d/%Y %H:%M:%S"))

period.sum(x = dat.xts, INDEX = endpoints(dat.xts, on = "hours"))
##                        [,1]
## 2012-01-01 00:55:00 59560.6
## 2012-01-01 01:00:00  4860.4
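
Note that this result is labelled with the last timestamp of each hour (00:55:00), not with the start of the hour as in the question. If start-of-hour labels are wanted, one possible approach is to truncate the index of the aggregated object. This is only a sketch: it rebuilds the sample data with timestamps in UTC (an assumption), and the names times, demand, x and hourly are illustrative.

```r
library(xts)

# Rebuild the 13 sample observations at 5-minute intervals (UTC assumed)
times <- seq(as.POSIXct("2012-01-01 00:00:00", tz = "UTC"),
             by = "5 min", length.out = 13)
demand <- c(5061.5, 5030.0, 5011.5, 4983.5, 4963.4, 4980.6, 4969.4,
            4961.7, 4929.0, 4907.1, 4892.8, 4870.1, 4860.4)
x <- xts(demand, order.by = times)

# Sum within each hour; the result is indexed by the last timestamp per hour
hourly <- period.sum(x, INDEX = endpoints(x, on = "hours"))

# Relabel each row with the start of its hour
index(hourly) <- as.POSIXct(trunc(index(hourly), units = "hours"))
hourly
```

The sums themselves are unchanged; only the timestamps attached to them move to the start of the hour.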

A more general example, showing how to apply any function over any period:

period.apply(dat.xts, INDEX = endpoints(dat.xts, on = "mins", k = 20), FUN = "sum")
##                        [,1]
## 2012-01-01 00:15:00 20086.5
## 2012-01-01 00:35:00 19875.1
## 2012-01-01 00:55:00 19599.0
## 2012-01-01 01:00:00  4860.4

In the examples above, the endpoints function creates the INDEX of period end points over which the function is applied.
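
To make the role of endpoints more concrete, the sketch below prints the index vector it returns for 13 observations spaced 5 minutes apart, like the sample data. The timestamps are generated in UTC, which is an assumption made for reproducibility, and the names x, ep_hour and ep_20min are illustrative.

```r
library(xts)

# 13 observations at 5-minute intervals, from 00:00 to 01:00 (UTC assumed)
times <- seq(as.POSIXct("2012-01-01 00:00:00", tz = "UTC"),
             by = "5 min", length.out = 13)
x <- xts(seq_along(times), order.by = times)

# Row positions of the last observation in each period, preceded by 0
ep_hour  <- endpoints(x, on = "hours")
ep_20min <- endpoints(x, on = "mins", k = 20)

ep_hour
ep_20min
```

For this input the hourly end points should be 0, 12, 13: rows 1 to 12 form the first hour and row 13 starts the second. The 20-minute end points should be 0, 4, 8, 12, 13, which matches the four rows in the period.apply output above. period.sum and period.apply then apply the function to the rows between consecutive end points.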



Source: https://stackoverflow.com/questions/22368796/aggregating-minutes-to-hour-demand
