Question
Looking for a function in R to convert dates into week numbers (of the year), I went for week from package data.table.
However, I observed some strange behaviour:
> week("2014-03-16") # Sun, expecting 11
[1] 11
> week("2014-03-17") # Mon, expecting 12
[1] 11
> week("2014-03-18") # Tue, expecting 12
[1] 12
Why is the week number switching to 12 on Tuesday instead of Monday? What am I missing? (The time zone should be irrelevant, as these are just dates.)
Other suggestions for (base) R functions are appreciated as well.
Answer 1:
Base package
Use the function strftime with the format %V to obtain the week of the year as a decimal number (01–53) as defined in ISO 8601. (More details in the documentation: ?strftime)
strftime(c("2014-03-16", "2014-03-17","2014-03-18", "2014-01-01"), format = "%V")
Output:
[1] "11" "12" "12" "01"
Answer 2:
If you try with lubridate:
library(lubridate)
lubridate::week(ymd("2014-03-16", "2014-03-17","2014-03-18", '2014-01-01'))
[1] 11 11 12 1
The pattern is the same. Try isoweek
lubridate::isoweek(ymd("2014-03-16", "2014-03-17","2014-03-18", '2014-01-01'))
[1] 11 12 12 1
Answer 3:
Actually, I think you may have discovered a bug in the week(...) function, or at least an error in the documentation. Hopefully someone will jump in and explain why I am wrong.
Looking at the code:
library(lubridate)
> week
function (x)
yday(x)%/%7 + 1
<environment: namespace:lubridate>
The documentation states:
Weeks is the number of complete seven day periods that have occured between the date and January 1st, plus one.
But since Jan 1 is the first day of the year (not the zeroth), the first "week" will be a six day period. The code should (??) be
(yday(x)-1)%/%7 + 1
NB: You are using week(...) in the data.table package, which is the same code as lubridate::week except it coerces everything to integer rather than numeric for efficiency. So this function has the same problem (??).
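To see where the Tuesday switch comes from, here is a small base-R sketch that re-implements both formulas quoted above. The helper names yday1, week_as_quoted and week_shifted are made up for illustration and are not part of lubridate or data.table; yday1 is assumed to behave like a 1-based day of the year.

```r
# Hypothetical helpers (base R only), mirroring the two formulas in this answer
yday1 <- function(x) as.POSIXlt(as.Date(x))$yday + 1    # 1-based day of year
week_as_quoted <- function(x) yday1(x) %/% 7 + 1        # formula shown in the source listing
week_shifted   <- function(x) (yday1(x) - 1) %/% 7 + 1  # formula proposed in this answer

d <- as.Date(c("2014-01-06", "2014-01-07", "2014-01-08",
               "2014-03-17", "2014-03-18", "2014-03-19"))

# %u is the ISO weekday number (1 = Monday, ..., 7 = Sunday)
data.frame(day       = d,
           iso_wday  = format(d, "%u"),
           as_quoted = week_as_quoted(d),
           shifted   = week_shifted(d))
```

With the quoted formula, week 1 covers only January 1st to 6th, so every later week starts on the weekday of January 7th (a Tuesday in 2014). With the shifted formula, every week starts on the weekday of January 1st (a Wednesday in 2014). Neither variant is aligned to Mondays, so if ISO weeks are what you expect, %V or isoweek from the other answers is probably the better fit.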
Answer 4:
I think the problem is that the week calculation somehow uses the first day of the year. I don't understand the internal mechanics, but you can see what I mean with this example:
library(data.table)
dd <- seq(as.IDate("2013-12-20"), as.IDate("2014-01-20"), 1)
# dd <- seq(as.IDate("2013-12-01"), as.IDate("2014-03-31"), 1)
dt <- data.table(i = 1:length(dd),
                 day = dd,
                 weekday = weekdays(dd),
                 day_rounded = round(dd, "weeks"))
## Now let's add the weekdays for the "rounded" date
dt[ , weekday_rounded := weekdays(day_rounded)]
## This seems to make internal sense with the "week" calculation
dt[ , weeknumber := week(day)]
dt
i day weekday day_rounded weekday_rounded weeknumber
1: 1 2013-12-20 Friday 2013-12-17 Tuesday 51
2: 2 2013-12-21 Saturday 2013-12-17 Tuesday 51
3: 3 2013-12-22 Sunday 2013-12-17 Tuesday 51
4: 4 2013-12-23 Monday 2013-12-24 Tuesday 52
5: 5 2013-12-24 Tuesday 2013-12-24 Tuesday 52
6: 6 2013-12-25 Wednesday 2013-12-24 Tuesday 52
7: 7 2013-12-26 Thursday 2013-12-24 Tuesday 52
8: 8 2013-12-27 Friday 2013-12-24 Tuesday 52
9: 9 2013-12-28 Saturday 2013-12-24 Tuesday 52
10: 10 2013-12-29 Sunday 2013-12-24 Tuesday 52
11: 11 2013-12-30 Monday 2013-12-31 Tuesday 53
12: 12 2013-12-31 Tuesday 2013-12-31 Tuesday 53
13: 13 2014-01-01 Wednesday 2014-01-01 Wednesday 1
14: 14 2014-01-02 Thursday 2014-01-01 Wednesday 1
15: 15 2014-01-03 Friday 2014-01-01 Wednesday 1
16: 16 2014-01-04 Saturday 2014-01-01 Wednesday 1
17: 17 2014-01-05 Sunday 2014-01-01 Wednesday 1
18: 18 2014-01-06 Monday 2014-01-01 Wednesday 1
19: 19 2014-01-07 Tuesday 2014-01-08 Wednesday 2
20: 20 2014-01-08 Wednesday 2014-01-08 Wednesday 2
21: 21 2014-01-09 Thursday 2014-01-08 Wednesday 2
22: 22 2014-01-10 Friday 2014-01-08 Wednesday 2
23: 23 2014-01-11 Saturday 2014-01-08 Wednesday 2
24: 24 2014-01-12 Sunday 2014-01-08 Wednesday 2
25: 25 2014-01-13 Monday 2014-01-08 Wednesday 2
26: 26 2014-01-14 Tuesday 2014-01-15 Wednesday 3
27: 27 2014-01-15 Wednesday 2014-01-15 Wednesday 3
28: 28 2014-01-16 Thursday 2014-01-15 Wednesday 3
29: 29 2014-01-17 Friday 2014-01-15 Wednesday 3
30: 30 2014-01-18 Saturday 2014-01-15 Wednesday 3
31: 31 2014-01-19 Sunday 2014-01-15 Wednesday 3
32: 32 2014-01-20 Monday 2014-01-15 Wednesday 3
i day weekday day_rounded weekday_rounded weeknumber
My workaround is this function: https://github.com/geneorama/geneorama/blob/master/R/round_weeks.R
round_weeks <- function(x){
    require(data.table)
    dt <- data.table(i = 1:length(x),
                     day = x,
                     weekday = weekdays(x))
    offset <- data.table(weekday = c('Sunday', 'Monday', 'Tuesday', 'Wednesday',
                                     'Thursday', 'Friday', 'Saturday'),
                         offset = -(0:6))
    dt <- merge(dt, offset, by="weekday")
    dt[ , day_adj := day + offset]
    setkey(dt, i)
    return(dt[ , day_adj])
}
Of course, you can easily change the offset to make Monday first or whatever. The best way to do this would be to add an offset to the offset... but I haven't done that yet.
I provided a link to my simple geneorama package, but please don't rely on it too much because it's likely to change and not very documented.
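For the "offset to the offset" idea, here is a minimal base-R sketch. It is not the geneorama implementation; floor_to_week and its week_start_day parameter are hypothetical names. It relies on the numeric weekday from as.POSIXlt, so it does not depend on the locale's day names.

```r
# Hypothetical helper: floor each date to the first day of its week.
# week_start_day: 0 = Sunday (as in round_weeks above), 1 = Monday, ..., 6 = Saturday
floor_to_week <- function(x, week_start_day = 0) {
  x <- as.Date(x)
  wday <- as.POSIXlt(x)$wday         # 0 = Sunday, ..., 6 = Saturday
  x - (wday - week_start_day) %% 7   # step back to the most recent week start
}

floor_to_week(c("2014-03-16", "2014-03-18"))                      # weeks starting on Sunday
floor_to_week(c("2014-03-16", "2014-03-18"), week_start_day = 1)  # weeks starting on Monday
```

Because %% in R always returns a non-negative result for a positive divisor, the same expression covers every choice of starting weekday without a lookup table.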
Answer 5:
If you want to get the week number together with the year, use the format "%Y-W%V", e.g.
yearAndweeks <- strftime(dates, format = "%Y-W%V")
so
> strftime(c("2014-03-16", "2014-03-17","2014-03-18", "2014-01-01"), format = "%Y-W%V")
becomes:
[1] "2014-W11" "2014-W12" "2014-W12" "2014-W01"
Answer 6:
I understand the need for packages in certain situations, but the base language is so elegant and so proven (and debugged and optimized).
Why not:
dt <- as.Date("2014-03-16")
dt2 <- as.POSIXlt(dt)
dt2$yday
[1] 74
Note that yday is zero-based (January 1st is 0). From there it is your choice whether the first week of the year is zero (as in indexing in C) or 1 (as in indexing in R).
No packages to learn, update, worry about bugs in.
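A short sketch of that last step, using only base R. The variable names are illustrative; the division by 7 is one reasonable reading of "week of the year" here, not the only one.

```r
d <- as.Date(c("2014-01-01", "2014-01-07", "2014-01-08", "2014-03-16"))
yday0 <- as.POSIXlt(d)$yday        # 0-based day of year: 0, 6, 7, 74

week_from_zero <- yday0 %/% 7      # first week of the year is 0
week_from_one  <- yday0 %/% 7 + 1  # first week of the year is 1

data.frame(day = d, yday0, week_from_zero, week_from_one)
```

Weeks counted this way start on whatever weekday January 1st falls on, so they will generally not line up with Monday-based ISO weeks.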
Answer 7:
Using only base, I wrote the following function.
Note:
- Assumes Mon is day number 1 in the week
- First week is week 1
- Returns 0 if the date falls before the first Monday of the year, i.e. it belongs to the last week of the previous year
Fine-tune to suit your needs.
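findWeekNo <- function(myDate){
  # Find the weekday of January 1st (Mon = 1, ..., Sun = 7); week 1 starts
  # on the first Monday of the year.
  # Note: weekdays() returns localized names, so this switch assumes an English locale.
  weekday <- switch(weekdays(as.Date(paste(format(as.Date(myDate),"%Y"),"01-01", sep = "-"))),
                    "Monday"={1},
                    "Tuesday"={2},
                    "Wednesday"={3},
                    "Thursday"={4},
                    "Friday"={5},
                    "Saturday"={6},
                    "Sunday"={7}
  )
  # Day of January on which the first Monday falls
  firstMon <- ifelse(weekday==1,1, 9 - weekday )
  # yday is 0-based; dates before the first Monday yield week 0
  weekNo <- floor((as.POSIXlt(myDate)$yday - (firstMon-1))/7)+1
  return(weekNo)
}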
findWeekNo("2017-01-15") # 2
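The convention described in the notes above (Monday first, week 1 starts on the first Monday, earlier days are week 0) appears to match the %W format of strftime. Below is a hedged, vectorized sketch that does not depend on locale-specific day names; week_from_first_monday is a hypothetical name, not part of base R.

```r
# Hypothetical vectorized variant of the idea above, base R only
week_from_first_monday <- function(x) {
  x <- as.Date(x)
  jan1 <- as.Date(paste0(format(x, "%Y"), "-01-01"))
  # Days from January 1st to the first Monday (wday: 0 = Sunday, 1 = Monday, ...)
  days_to_first_mon <- (8 - as.POSIXlt(jan1)$wday) %% 7
  # yday is 0-based; dates before the first Monday end up in week 0
  (as.POSIXlt(x)$yday - days_to_first_mon) %/% 7 + 1
}

d <- as.Date(c("2017-01-01", "2017-01-02", "2017-01-15"))
week_from_first_monday(d)    # hand-rolled calculation
as.integer(format(d, "%W"))  # same convention via strftime's %W
```

If %W gives what you need, the one-liner is probably preferable; the function is mainly useful as a starting point when you want to change the convention.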
Answer 8:
If you want to get the week number with the year, Grant Shannon's solution using strftime works, but you need to make some corrections for the dates around January 1st. For instance, 2016-01-03 (yyyy-mm-dd) is week 53 of year 2015, not 2016, and 2018-12-31 is week 1 of 2019, not of 2018. This code provides some examples and a solution. In column "yearweek" the years are sometimes wrong; in "yearweek2" they are corrected (rows 2 and 5).
library(dplyr)
library(lubridate)
# create a testset
test <- data.frame(matrix(data = c("2015-12-31",
"2016-01-03",
"2016-01-04",
"2018-12-30",
"2018-12-31",
"2019-01-01") , ncol=1, nrow = 6 ))
# add a colname
colnames(test) <- "date_txt"
# this code provides correct year-week numbers
test <- test %>%
  mutate(date = as.Date(date_txt, format = "%Y-%m-%d")) %>%
  mutate(yearweek = as.integer(strftime(date, format = "%Y%V"))) %>%
  mutate(yearweek2 = ifelse(test = day(date) > 7 & substr(yearweek, 5, 6) == '01',
                            yes = yearweek + 100,
                            no = ifelse(test = month(date) == 1 & as.integer(substr(yearweek, 5, 6)) > 51,
                                        yes = yearweek - 100,
                                        no = yearweek)))
# print the result
print(test)
date_txt date yearweek yearweek2
1 2015-12-31 2015-12-31 201553 201553
2 2016-01-03 2016-01-03 201653 201553
3 2016-01-04 2016-01-04 201601 201601
4 2018-12-30 2018-12-30 201852 201852
5 2018-12-31 2018-12-31 201801 201901
6 2019-01-01 2019-01-01 201901 201901
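As a possible shortcut for the same correction, strftime also documents %G, the ISO 8601 week-based year that belongs to %V. The sketch below uses only base R and assumes that %G is supported for output on your platform and R version, which is worth checking in ?strptime.

```r
dates <- as.Date(c("2015-12-31", "2016-01-03", "2016-01-04",
                   "2018-12-30", "2018-12-31", "2019-01-01"))

data.frame(date          = dates,
           calendar_year = format(dates, "%Y%V"),    # calendar year + ISO week (mismatched near New Year)
           iso_year      = format(dates, "%G%V"),    # week-based year + ISO week
           iso_label     = format(dates, "%G-W%V"))  # e.g. "2015-W53"
```

The iso_year column should agree with yearweek2 in the output above, without the manual adjustments by 100.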
Source: https://stackoverflow.com/questions/22439540/how-to-get-week-numbers-from-dates