I have a dataset with locations and dates. I would like to calculate the week of the year as a number (00–53), but using Thursday as the first day of the week. The data looks like
Since the question states that the week number runs from 00 to 53, we assume that the week number is the number of Thursdays in the year on or before the date in question. Thus the first Thursday of the year begins week 1, and week 0 is assigned to any days before it.
(Some comments suggested that if the first day of the year were a Tuesday then that day would fall in week 1, but in that case there could never be a week 0, which the subject line seems to require, so the precise definition of week number may need clarifying. Here we use the definition in the preceding paragraph, but it would not be hard to change. For example, if we always wanted the first week of the year to be week 1 even when it is a short week, we could add !is.thu(jan1(d)) to the result.)
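As a rough sketch of that short-first-week variant: the helper names is.thu and jan1 follow solution 1 below, while week1first is a hypothetical name introduced only for this illustration. The Thursday test here uses as.POSIXlt(x)$wday rather than weekdays(), which is an assumption made so that the snippet does not depend on the locale.

```r
# Thursday test via POSIXlt: wday runs from 0 (Sunday) to 6 (Saturday)
is.thu <- function(x) as.POSIXlt(x)$wday == 4L
jan1 <- function(x) as.Date(cut(x, "year"))

# Number of Thursdays in the year on or before each date (week 0 is possible)
count_thu <- function(d) {
  sapply(d, function(x) sum(is.thu(seq(jan1(x), x, by = "day"))))
}

# Hypothetical variant: add 1 whenever the year does not start on a Thursday,
# so that a short first week is numbered 1 instead of 0
week1first <- function(d) count_thu(d) + (!is.thu(jan1(d)))

# 2013 starts on a Tuesday, 2015 starts on a Thursday
x <- as.Date(c("2013-01-01", "2013-01-03", "2015-01-01", "2015-01-08"))
week1first(x)  # 1 2 1 2
```

With this variant no date is ever assigned week 0, which is why it conflicts with the 00–53 range in the question.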
Both of the solutions below are short enough to be written as a single statement; however, we have factored each into several short functions for clarity. The first is particularly straightforward, but the second is automatically vectorized, without the need for sapply, and would likely be more efficient.
1. sum Thursdays in year
This solution assumes the input d is of class "Date" and simply counts the Thursdays in the year that fall on or before it:
is.thu <- function(x) weekdays(x) == "Thursday"
jan1 <- function(x) as.Date(cut(x, "year"))
week4 <- function(d) {
sapply(d, function(d) sum(is.thu(seq(jan1(d), d, by = "day"))))
}
We can test it like this:
d <- as.Date(c("2013-01-04", "2013-01-26", "2013-02-03", "2013-02-09",
"2013-02-20", "2013-03-03"))
week4(d) # 1 4 5 6 7 9
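One caveat: weekdays() returns day names in the current locale, so the comparison with "Thursday" only works in an English locale. A minimal sketch of a locale-independent alternative follows; is.thu2 and week4_wday are hypothetical names used only here. It also shows the days before the first Thursday of 2013 receiving week 0.

```r
# POSIXlt stores the weekday as an integer: 0 (Sunday) to 6 (Saturday)
is.thu2 <- function(x) as.POSIXlt(x)$wday == 4L
jan1 <- function(x) as.Date(cut(x, "year"))

# Same counting logic as week4, but with the numeric weekday test
week4_wday <- function(d) {
  sapply(d, function(x) sum(is.thu2(seq(jan1(x), x, by = "day"))))
}

# 2013-01-01 is a Tuesday, so the first Thursday is 2013-01-03
early <- as.Date(c("2013-01-01", "2013-01-02", "2013-01-03"))
week4_wday(early)  # 0 0 1
```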
2. nextthu
Based on the nextfri function in the zoo quickref vignette, the number of days since the Epoch (1970-01-01) of the next Thursday (or of the day in question if it's already a Thursday) is given by nextthu in the first line below. This works because 1970-01-01 was itself a Thursday, so Thursdays are exactly the day numbers divisible by 7. Applying nextthu to the first day of the year gives the result, where d is as before:
nextthu <- function(d) 7 * ceiling(as.numeric(d) / 7)
week4a <- function(d) (as.numeric(d) - nextthu(jan1(d))) %/% 7 + 1
Here is a test:
week4a(d) # 1 4 5 6 7 9
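As a sanity check, the two solutions can be compared on every day of several years. This is only a sketch: the range 2013 to 2016 is an arbitrary choice, and is.thu is written with as.POSIXlt(x)$wday instead of weekdays() so that the check does not depend on the locale.

```r
is.thu <- function(x) as.POSIXlt(x)$wday == 4L   # 4 = Thursday
jan1 <- function(x) as.Date(cut(x, "year"))

# Solution 1: count Thursdays from Jan 1 up to and including each date
week4 <- function(d) {
  sapply(d, function(x) sum(is.thu(seq(jan1(x), x, by = "day"))))
}

# Solution 2: arithmetic on day numbers (day 0, 1970-01-01, is a Thursday)
nextthu <- function(d) 7 * ceiling(as.numeric(d) / 7)
week4a <- function(d) (as.numeric(d) - nextthu(jan1(d))) %/% 7 + 1

# Every day from 2013 through 2016, including a leap year
days <- seq(as.Date("2013-01-01"), as.Date("2016-12-31"), by = "day")
all(week4(days) == week4a(days))  # TRUE
```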
ADDED: fixed bug in second solution.
Just add 3 to the Date-formatted values:
> mydf$Dt <- as.Date(mydf$date, format="%d-%m-%Y")
> weeknum <- as.numeric( format(mydf$Dt+3, "%U"))
> weeknum
[1] 1 4 5 6 7 9
This uses a 0-based counting convention, since that is what strftime provides and we are just piggybacking on that code, so the first Friday in a year that begins on a Tuesday, as was the case in 2013, gives a result of 1. Add 1 to the value if you want a 1-based convention. (Fundamentally, Date-formatted values are an integer sequence counted from the "origin", so they don't really recognize years or weeks. Adding 3 just shifts the reference frame of the underlying Date integer.)
Edit note: Changed to an add-three strategy per Gabor's advice, which still does not address the question of how to deal with the last week of the prior year.
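To make the year-end caveat concrete, here is a small self-contained sketch. The mydf data frame is a hypothetical stand-in for the question's data, assuming day-month-year strings as implied by the format string above.

```r
# Hypothetical stand-in for the question's data (day-month-year strings)
mydf <- data.frame(date = c("04-01-2013", "26-01-2013", "03-02-2013",
                            "09-02-2013", "20-02-2013", "03-03-2013"),
                   stringsAsFactors = FALSE)
mydf$Dt <- as.Date(mydf$date, format = "%d-%m-%Y")

# %U counts weeks starting on Sunday; Thursday + 3 days lands on a Sunday
weeknum <- as.numeric(format(mydf$Dt + 3, "%U"))
weeknum  # 1 4 5 6 7 9

# Year-end caveat: the shifted date can spill into the following year
late <- as.Date("2013-12-30")
format(late + 3, "%Y-%m-%d")        # "2014-01-02"
as.numeric(format(late + 3, "%U"))  # 0, a week number taken from 2014
```

Dates in the last three days of December are shifted into January, so their week number comes from the next year's calendar rather than the current one.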