Question
Hi all, this should be a straightforward question, but I can't seem to figure it out. I would like to break up this data set biweekly in order to look at the annual cycle in 2-week intervals. I do not want to summarize or aggregate the data. I would like to do exactly what the 'week' function does, but for every 2 weeks instead. Below is an example of the data and code. Any help would be greatly appreciated!
DF<-dput(head(indiv))
structure(list(event.id = 1142811808:1142811813, timestamp = structure(c(1323154800,
1323200450, 1323202141, 1323203545, 1323208151, 1323209966), class = c("POSIXct",
"POSIXt"), tzone = "UTC"), argos.altitude = c(43, 43, 39, 43,
44, 42), argos.best.level = c(0, -136, -128, -136, -126, -137
), argos.calcul.freq = c(0, 676813.1, 676802.4, 676813.1, 676810,
676811.8), argos.lat1 = c(43.857, 43.916, 43.87, 43.89, 43.891,
43.89), argos.lat2 = c(43.857, 35.141, 49.688, 35.254, 40.546,
54.928), argos.lc = structure(c(7L, 6L, 2L, 3L, 4L, 3L), .Label = c("0",
"1", "2", "3", "A", "B", "G", "Z"), class = "factor"), argos.lon1 = c(-77.244,
-77.326, -77.223, -77.21, -77.208, -77.21), argos.lon2 = c(-77.244,
-121.452, -46.86, -118.496, -94.12, -16.159), argos.nb.mes.identical = c(0L,
2L, 6L, 4L, 5L, 6L), argos.nopc = c(0L, 1L, 2L, 3L, 4L, 4L),
argos.sensor.1 = c(0L, 149L, 194L, 1L, 193L, 193L), argos.sensor.2 = c(0L,
220L, 216L, 1L, 216L, 212L), argos.sensor.3 = c(0L, 1L, 1L,
0L, 3L, 1L), argos.sensor.4 = c(0L, 1L, 5L, 1L, 5L, 5L),
tag.local.identifier = c(112571L, 112571L, 112571L, 112571L,
112571L, 112571L), utm.easting = c(319655.836066914, 313250.096346666,
321382.422921619, 322486.41178559, 322650.029658403, 322486.41178559
), utm.northing = c(4858437.89950188, 4865173.18448801, 4859836.18321128,
4862029.54057323, 4862136.31345349, 4862029.54057323), utm.zone = structure(c(7L,
7L, 7L, 7L, 7L, 7L), .Label = c("12N", "13N", "14N", "15N",
"16N", "17N", "18N", "19N", "20N", "22N", "39N"), class = "factor"),
study.timezone = structure(c(2L, 2L, 2L, 2L, 2L, 2L), .Label = c("Eastern Daylight Time",
"Eastern Standard Time"), class = "factor"), study.local.timestamp = structure(c(1323154800,
1323200450, 1323202141, 1323203545, 1323208151, 1323209966
), class = c("POSIXct", "POSIXt"), tzone = "")), row.names = 1120:1125, class = "data.frame")
weeknumber <- week(DF$timestamp)  # week() is from lubridate
Answer 1:
I don't use lubridate, but here's a base R solution to subset your data fortnightly. We keep the rows whose week number (%U, converted to numeric) is odd and whose week-year combination has not appeared before, i.e. the first row of every other week. All of this uses strftime.
res <- DF[as.numeric(strftime(DF$timestamp, "%U")) %% 2 != 0 &
!duplicated(strftime(DF$timestamp, "%U %y")), ]
res
# timestamp x
# 1 2011-12-06 01:00:00 0.73178884
# 13 2011-12-18 01:00:00 -0.19310018
# 27 2012-01-01 01:00:00 1.13017531
# 41 2012-01-15 01:00:00 1.06546084
# 55 2012-01-29 01:00:00 -0.16664011
# 69 2012-02-12 01:00:00 -1.86596108
# 83 2012-02-26 01:00:00 0.59200189
# 97 2012-03-11 01:00:00 1.08327366
# 111 2012-03-25 01:00:00 -0.71291090
# 125 2012-04-08 02:00:00 0.51984052
# 139 2012-04-22 02:00:00 0.32738506
# 153 2012-05-06 02:00:00 2.50837829
# 167 2012-05-20 02:00:00 0.75116168
# 181 2012-06-03 02:00:00 -0.56359736
# 195 2012-06-17 02:00:00 0.60658448
# 209 2012-07-01 02:00:00 -0.07242813
# 223 2012-07-15 02:00:00 0.13811301
# 237 2012-07-29 02:00:00 0.19454153
# 251 2012-08-12 02:00:00 0.23119092
# 265 2012-08-26 02:00:00 -0.97278351
# 279 2012-09-09 02:00:00 -1.18143276
# 293 2012-09-23 02:00:00 -0.43294048
# 307 2012-10-07 02:00:00 0.05664472
# 321 2012-10-21 02:00:00 -0.90725782
# 335 2012-11-04 01:00:00 0.78939068
# 349 2012-11-18 01:00:00 -0.46047924
# 363 2012-12-02 01:00:00 1.45941339
Check by differencing.
## check
diff(res$timestamp)
# Time differences in days
# [1] 14 14 14 14 14 14 14 14 14 14 14 14 14 14 14 14 14 14 14 14
# [21] 14 14 14 14 14
Data:
DF <- data.frame(timestamp=as.POSIXct(seq(as.Date("2011-12-06"), as.Date("2012-12-06"), "day")),
x=rnorm(367))
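If the goal is a label for every row rather than a subset, a minimal sketch of a week()-style counter over 14-day periods could look like the following. The helper name biweek is hypothetical (it is not a lubridate function), and the toy timestamps are made up for illustration. It assumes the same convention that lubridate documents for week(): periods are counted from January 1 of each year.

```r
# Hypothetical helper (not part of lubridate): number the 14-day periods
# within each year, counting from January 1.
biweek <- function(x) {
  # $yday is the zero-based day of the year, so days 0-13 give biweek 1
  as.POSIXlt(x)$yday %/% 14 + 1
}

# Made-up timestamps for illustration
ts <- as.POSIXct(c("2011-12-06 07:00:00", "2011-12-17 12:00:00",
                   "2012-01-01 00:00:00", "2012-01-15 00:00:00"), tz = "UTC")
bw <- biweek(ts)
bw
# [1] 25 26  1  2
```

With this convention the last period of a year (biweek 27) only covers one or two days, in the same way that week 53 is a short week.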
Answer 2:
As I had said in my comment to your previous (since deleted) question, use seq.Date and either cut or findInterval.
I'll create a vector of "every other Monday", starting on January 3rd, 2011 (the first Monday of that year). This is arbitrary, but you will want to ensure that you choose (1) a day that is meaningful to you, (2) a start-point that is before your earliest data, and (3) a length.out= that extends beyond your latest data.
every_other_monday <- seq(as.Date("2011-01-03"), by = "14 days", length.out = 26)
every_other_monday
# [1] "2011-01-03" "2011-01-17" "2011-01-31" "2011-02-14" "2011-02-28" "2011-03-14" "2011-03-28" "2011-04-11" "2011-04-25"
# [10] "2011-05-09" "2011-05-23" "2011-06-06" "2011-06-20" "2011-07-04" "2011-07-18" "2011-08-01" "2011-08-15" "2011-08-29"
# [19] "2011-09-12" "2011-09-26" "2011-10-10" "2011-10-24" "2011-11-07" "2011-11-21" "2011-12-05" "2011-12-19"
every_other_monday[ findInterval(as.Date(DF$timestamp), every_other_monday) ]
# [1] "2011-12-05" "2011-12-05" "2011-12-05" "2011-12-05" "2011-12-05" "2011-12-05"
(The choice to start on Jan 3 was conditioned on the assumption that your real data spans a much larger length of time. You don't need a full year's worth of biweeks in every_other_monday, nor does it need to be a Monday; it can be whatever base date you choose. So long as it includes at least one date before and one after the actual DF dates, you should be covered.)
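To keep every row and simply attach the biweek to it, the same break points can be stored as new columns. The following is only a sketch: the toy data frame and the column names biweek_id, biweek_start and biweek_cut are assumptions for illustration, not part of the question's data. It also shows the cut variant mentioned above.

```r
# Toy timestamps standing in for DF$timestamp (assumed data)
toy <- data.frame(
  timestamp = as.POSIXct(c("2011-12-06 07:00:00", "2011-12-18 23:00:00",
                           "2011-12-19 00:30:00", "2012-01-03 12:00:00"),
                         tz = "UTC")
)

# Break points every 14 days, starting on a Monday before the earliest row
breaks <- seq(as.Date("2011-12-05"), by = "14 days", length.out = 4)

# Index of the 14-day interval each row falls into; no rows are dropped
toy$biweek_id <- findInterval(as.Date(toy$timestamp), breaks)

# Start date of that interval
toy$biweek_start <- breaks[toy$biweek_id]

# cut() yields the same grouping as a factor labelled by interval start
toy$biweek_cut <- cut(as.Date(toy$timestamp), breaks = breaks)

toy
```

Note that cut returns NA for dates outside the break range, whereas findInterval returns 0 for dates before the first break, which is one more reason to make the break vector span the whole data set.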
Alternative: round down to the week level, then move every week start whose day of the year is odd back by one week, so that only every other week start remains. (I chose the parity of the day of the year, $yday, to reduce the chance that the grouping shifts with slight changes in the data range.)
weeks <- lubridate::floor_date(as.Date(DF$timestamp), unit = "weeks")
weeks
# [1] "2011-12-04" "2011-12-04" "2011-12-04" "2011-12-04" "2011-12-04" "2011-12-04"
isodd <- as.POSIXlt(weeks)$yday %% 2 == 1
weeks[isodd] <- weeks[isodd] - 7L
weeks # technically, now "biweeks"
# [1] "2011-11-27" "2011-11-27" "2011-11-27" "2011-11-27" "2011-11-27" "2011-11-27"
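One caveat with the parity approach: $yday restarts every January, so two consecutive week starts can have the same parity at a year boundary (for example 2011-12-25 and 2012-01-01 are both even), which gives a one-week period there. A possible variant, sketched here as an assumption rather than as part of the answer, counts days from a fixed anchor date instead. The function name biweek_floor and the anchor 2011-01-02 (a Sunday) are hypothetical choices.

```r
# Hypothetical helper: floor each date to the start of its 14-day period,
# counted from a fixed anchor date so the grid never resets at New Year.
biweek_floor <- function(x, origin = as.Date("2011-01-02")) {
  d <- as.Date(x)
  offset <- as.integer(d) - as.integer(origin)  # whole days since the anchor
  origin + (offset %/% 14L) * 14L               # %/% floors, so earlier dates work too
}

# Made-up dates for illustration
x <- as.Date(c("2011-12-06", "2011-12-17", "2011-12-18", "2012-01-01"))
starts <- biweek_floor(x)
starts
# [1] "2011-12-04" "2011-12-04" "2011-12-18" "2012-01-01"
```

Every returned date is a multiple of 14 days from the anchor, so consecutive period starts are always exactly two weeks apart.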
Source: https://stackoverflow.com/questions/65687615/make-week-function-biweek