Aggregate count of timeseries values which exceed threshold, by year-month

别说谁变了你拦得住时间么 submitted on 2021-01-29 18:10:41

Question


I am learning R and using the seas package to help with some calculations. My data is a time series in the format that the seas package expects:

require(seas)
data(mscdata)
dat.int <- (mksub(mscdata, id=1108447))

The data covers 20 years and has the following column headings:

  year yday  date t_max t_min t_mean rain snow precip

I now need to calculate the number of days in each month on which rainfall is >= 1.0 mm. The result should have two columns: each month of each year, and the total number of days in that month with rainfall >= 1.0 mm.

I'm not certain how to write this code, and any help would be appreciated.

Thank you

Lam


Answer 1:


You wrote: "I now need to calculate the number of days in each month rainfall is >= 1.0mm. So at the end of it I would have two columns (each month in each year and total # of days in each month rainfall >= 1.0mm)."

1) dat.int$date is a Date object, so the first step is to create a new column, dat.int$yearmon, holding the year-month, e.g. using zoo's as.yearmon (see also the Stack Overflow question "Extract month and year from a zoo::yearmon object"):

require(zoo)
# as.yearmon() on a Date needs no format string; a format is only used when parsing character input
dat.int$yearmon <- as.yearmon(dat.int$date)

2) Second, do a summarize operation on rain >= 1.0, grouped by yearmon (I recommend plyr or the newer dplyr). Let's name the resulting column rainy_days.

If you want to store the rainy_days column back in the dat.int data frame, use transform instead of summarize:

require(plyr)
dat.int <- ddply(dat.int, .(yearmon), transform, rainy_days = sum(rain >= 1.0))

Or, if you just want a new summary data frame:

require(plyr)
rainydays_by_yearmon <- ddply(dat.int, .(yearmon), summarize, rainy_days=sum(rain >= 1.0) )
print.data.frame(rainydays_by_yearmon)

     yearmon rainy_days
1   Jan 1975         14
2   Feb 1975         12
3   Mar 1975         13
4   Apr 1975          6
5   May 1975          6
6   Jun 1975          5
...
355 Jul 2004          3
356 Aug 2004          7
357 Oct 2004         14
358 Nov 2004         16
359 Dec 2004         19
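
The answer recommends dplyr as the newer alternative to plyr but does not show it. The following is a minimal sketch of the same count with dplyr, not the original author's code. It assumes dplyr is installed, and it uses a small made-up data frame (toy, with invented rainfall values) in place of dat.int so that it runs without the seas package. The year-month key is built with base R's format() rather than zoo, and na.rm = TRUE is an added assumption about how missing readings should be treated.

```r
library(dplyr)

# Hypothetical stand-in for dat.int: five days spanning two months
toy <- data.frame(
  date = seq(as.Date("2000-01-30"), as.Date("2000-02-03"), by = "day"),
  rain = c(0.0, 2.5, 1.0, NA, 3.4)
)

rainy <- toy %>%
  # Year-month key as a sortable string such as "2000-01"
  mutate(yearmon = format(date, "%Y-%m")) %>%
  group_by(yearmon) %>%
  # Count days at or above the 1.0 mm threshold, skipping missing readings
  summarise(rainy_days = sum(rain >= 1.0, na.rm = TRUE))

print(rainy)
```

Without na.rm = TRUE, any month containing a missing rain value would get NA as its count, which is also how the plyr calls above behave.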

Note: you can do the above in base R, without the zoo or plyr/dplyr packages, but I might as well teach you nicer, more scalable, more maintainable code idioms.
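
For comparison, here is one way the base R route mentioned in the note could look. This is a sketch under assumptions rather than the answer's own code: toy is a made-up data frame standing in for dat.int, the year-month key comes from format() instead of zoo::as.yearmon, and na.rm = TRUE is an added choice for missing readings.

```r
# Hypothetical stand-in for dat.int: five days spanning two months
toy <- data.frame(
  date = seq(as.Date("2000-01-30"), as.Date("2000-02-03"), by = "day"),
  rain = c(0.0, 2.5, 1.0, NA, 3.4)
)

# Year-month key as a sortable string such as "2000-01"
toy$yearmon <- format(toy$date, "%Y-%m")

# Sum the logical vector rain >= 1.0 within each year-month;
# na.rm = TRUE is passed through to sum() so missing readings are skipped
rainy <- aggregate(
  list(rainy_days = toy$rain >= 1.0),
  by = list(yearmon = toy$yearmon),
  FUN = sum, na.rm = TRUE
)

print(rainy)
```

A string key like "2000-01" sorts chronologically, whereas a key formatted as "%b %Y" would sort alphabetically by month name.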



Source: https://stackoverflow.com/questions/26473724/aggregate-count-of-timeseries-values-which-exceed-threshold-by-year-month
