How to Parse Year + Week Number in R?

天涯浪人 2020-11-29 08:24

Is there a good way to convert a year + week number to a date in R? I have tried the following:

> as.POSIXct("2008 41", format="%Y %U")
[1] "2008-


        
3 Answers
  • 2020-11-29 08:58

    Day-of-week zero in the POSIXlt (DateTimeClasses) system is Sunday. Not exactly Biblical, and not in agreement with R's convention of indexing from 1 either, but that's what it is. Week zero is the first (partial) week in the year. Week one (but day of week zero) starts with the first Sunday. Most of the other counting fields in POSIXlt ($mon, $yday, $hour and so on) also start at 0; $mday, which starts at 1, is the exception. It is kind of interesting to see what coercing the list elements of POSIXlt objects does. The only way you can actually change a POSIXlt date is to alter the $year, $mon or $mday elements. The others seem to be epiphenomena.

    today <- as.POSIXlt(Sys.Date())
    today  # Tuesday
    # [1] "2012-02-21 UTC"
    today$wday <- 0  # attempt to make it Sunday
    today
    # [1] "2012-02-21 UTC"   The attempt fails
    today$mday <- 19
    today
    # [1] "2012-02-19 UTC"   Success
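
    The fields can also be inspected directly. As a minimal sketch (the fixed date and the names `d`, `fields` and `sunday` are illustrative, not part of the answer above), this prints the counting fields and then reaches the Sunday that starts the week by date arithmetic rather than by assigning to `$wday`:

    ```r
    d <- as.POSIXlt("2012-02-21", tz = "UTC")   # the Tuesday used above

    # wday, yday and mon count from 0; mday counts from 1; year counts from 1900
    fields <- c(wday = d$wday, yday = d$yday, mon = d$mon,
                mday = d$mday, year = d$year + 1900)
    print(fields)

    # Step back d$wday days to land on the Sunday that starts this week
    sunday <- as.Date(d) - d$wday
    print(sunday)
    ```

    Subtracting days from a `Date` sidesteps the question of which POSIXlt fields are authoritative, since the weekday is then recomputed from the new date.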
    
  • 2020-11-29 09:17

    This is kinda like another question you may have seen before. :)

    The key issue is: what day should a week number specify? Is it the first day of the week? The last? That's ambiguous. I don't know if week one is the first day of the year or the 7th day of the year, or possibly the first Sunday or Monday of the year (which is a frequent interpretation). (And it's worse than that: these generally appear to be 0-indexed, rather than 1-indexed.) So, an enumerated day of the week needs to be specified.

    For instance, try this:

    as.POSIXlt("2008 42 1", format = "%Y %U %u")
    

    The %u indicator specifies the day of the week as a number from 1 to 7, with Monday as 1.

    Additional note: See ?strptime for the various options for format conversion. It's important to be careful about the enumeration of weeks, as these can be split across the end of the year, and day 1 is ambiguous: is it specified based on a Sunday or Monday, or from the first day of the year? This should all be specified and tested on the different systems where the R code will run. I'm not certain that Windows and POSIX systems sing the same tune on some of these conversions, hence I'd test and test again.
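
    A minimal sketch of that advice, assuming Sunday-based `%U` weeks and a hypothetical helper named `week_to_date` (neither the name nor the Monday default comes from the answer): it pins the weekday explicitly, then formats the result back so the week number can be checked on whatever system the code runs on.

    ```r
    # Hypothetical helper: year + %U week number + %u weekday (1 = Monday) -> Date
    week_to_date <- function(year, week, weekday = 1) {
      as.Date(paste(year, week, weekday), format = "%Y %U %u")
    }

    d <- week_to_date(2008, 42)
    print(d)

    # Round-trip check: formatting the date should reproduce the inputs
    roundtrip <- format(d, "%Y %U %u")
    print(roundtrip)
    stopifnot(roundtrip == "2008 42 1")
    ```

    Keeping the `stopifnot()` round-trip next to the conversion is one cheap way to notice if a platform interprets the week fields differently.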

  • 2020-11-29 09:21

    I did not come up with this myself (it's taken from a blog post by Forester), but nevertheless I thought I'd add this to the answer list because it's the first implementation of the ISO 8601 week number convention that I've seen in R.

    No doubt, week numbers are a very ambiguous topic, but I prefer an ISO standard over the current implementation of week numbers via format(..., "%U") because it seems that this is what most people agreed on, at least in Germany (calendars etc.).

    I've put the actual function def at the bottom to facilitate focusing on the output first. Also, I just stumbled across package ISOweek, maybe worth a try.
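
    For a quick cross-check that needs no extra package, here is a small sketch using base R's `%V` (ISO 8601 week) and `%G` (ISO week-based year) output formats next to `%U`. `?strptime` documents `%V` and `%G` as accepted but ignored on input, so this only formats dates; it does not parse week numbers. The three sample dates and the column names are my own choice.

    ```r
    days <- as.Date(c("2012-01-01", "2012-01-02", "2012-01-08"))

    weeks <- data.frame(
      date     = days,
      week.U   = as.integer(format(days, "%U")),  # Sunday-based week of the year
      week.iso = as.integer(format(days, "%V")),  # ISO 8601 week number
      year.iso = as.integer(format(days, "%G"))   # year that the ISO week belongs to
    )
    print(weeks)
    ```

    Under the ISO rule, Sunday 2012-01-01 still belongs to week 52 of 2011, which is why the ISO year has to be reported together with the ISO week.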

    Approach Comparison

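    # Assumed setup (not shown in the original post): the January 2012 days
    # from the result table, with R's built-in "%U" week number.
    posix   <- seq(as.POSIXct("2012-01-01", tz="Europe/Berlin"), by="day", length.out=31)
    weeknum <- as.numeric(format(posix, "%U"))

    x.days  <- c("Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun")
    x.names <- sapply(seq_along(posix), function(x) {
        x.day <- as.POSIXlt(posix[x], tz="Europe/Berlin")$wday
        if (x.day == 0) {
            x.day <- 7   # POSIXlt counts Sunday as 0; move it to the end of the week
        }
        x.days[x.day]
    })
    
    data.frame(
        posix, 
        name=x.names,
        week.r=weeknum, 
        week.iso=ISOweek(as.character(posix), tzone="Europe/Berlin")$weeknum
    )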
    
    # Result
    
            posix name week.r week.iso
    1  2012-01-01  Sun      1  4480458
    2  2012-01-02  Mon      1        1
    3  2012-01-03  Tue      1        1
    4  2012-01-04  Wed      1        1
    5  2012-01-05  Thu      1        1
    6  2012-01-06  Fri      1        1
    7  2012-01-07  Sat      1        1
    8  2012-01-08  Sun      2        1
    9  2012-01-09  Mon      2        2
    10 2012-01-10  Tue      2        2
    11 2012-01-11  Wed      2        2
    12 2012-01-12  Thu      2        2
    13 2012-01-13  Fri      2        2
    14 2012-01-14  Sat      2        2
    15 2012-01-15  Sun      3        2
    16 2012-01-16  Mon      3        3
    17 2012-01-17  Tue      3        3
    18 2012-01-18  Wed      3        3
    19 2012-01-19  Thu      3        3
    20 2012-01-20  Fri      3        3
    21 2012-01-21  Sat      3        3
    22 2012-01-22  Sun      4        3
    23 2012-01-23  Mon      4        4
    24 2012-01-24  Tue      4        4
    25 2012-01-25  Wed      4        4
    26 2012-01-26  Thu      4        4
    27 2012-01-27  Fri      4        4
    28 2012-01-28  Sat      4        4
    29 2012-01-29  Sun      5        4
    30 2012-01-30  Mon      5        5
    31 2012-01-31  Tue      5        5
    

    Function Def

    It's taken directly from the blog post; I've only changed a couple of minor things. The function is still kind of sketchy (e.g. the week number of the first date is far off), but I find it to be a nice start!

    ISOweek <- function(
        date, 
        format="%Y-%m-%d", 
        tzone="UTC", 
        return.val="weekofyear"
    ){
      ##converts dates into "dayofyear" or "weekofyear", the latter providing the ISO-8601 week
      ##date should be a vector of class Date or a vector of formatted character strings
      ##format refers to the date form used if a vector of
      ##  character strings  is supplied
    
      ##convert date to POSIXt format 
      if(class(date)[1]%in%c("Date","character")){
        date=as.POSIXlt(date,format=format, tz=tzone)
      }
    
      ##stop on unsupported input; convert POSIXct to POSIXlt so $year etc. exist
      if (!inherits(date, "POSIXt")) {
        stop("Date is of wrong format.")
      }else if(inherits(date, "POSIXct")){
        date=as.POSIXlt(date, tz=tzone)
      }
    
      if(return.val=="dayofyear"){
        ##add 1 because POSIXt is base zero
        return(date$yday+1)
      }else if(return.val=="weekofyear"){
        ##Based on the ISO8601 weekdate system,
        ## Monday is the first day of the week
        ## W01 is the week with 4 Jan in it.
        year=1900+date$year
        jan4=strptime(paste(year,1,4,sep="-"),format="%Y-%m-%d")
        wday=jan4$wday
    
        wday[wday==0]=7  ##convert to base 1, where Monday == 1, Sunday==7
    
        ##calculate the date of the first week of the year
        weekstart=jan4-(wday-1)*86400  
        weeknum=ceiling(as.numeric((difftime(date,weekstart,units="days")+0.1)/7))
    
        #########################################################################
        ##calculate week for days of the year occurring in the next year's week 1.
        #########################################################################
        mday=date$mday
        wday=date$wday
        wday[wday==0]=7
        year=ifelse(weeknum==53 & mday-wday>=28,year+1,year)
        weeknum=ifelse(weeknum==53 & mday-wday>=28,1,weeknum)
    
        ################################################################
        ##calculate week for days of the year occurring prior to week 1.
        ################################################################
    
        ##first calculate the number of weeks in the previous year
        year.shift=year-1
        jan4.shift=strptime(paste(year.shift,1,4,sep="-"),format="%Y-%m-%d")
        wday=jan4.shift$wday
        wday[wday==0]=7  ##convert to base 1, where Monday == 1, Sunday==7
        weekstart=jan4.shift-(wday-1)*86400
        ##note: unlike the first difftime() call, no units= is given here,
        ##so difftime() chooses the units itself
        weeknum.shift=ceiling(as.numeric((difftime(date,weekstart)+0.1)/7))
    
        ##update year and week
        year=ifelse(weeknum==0,year.shift,year)
        weeknum=ifelse(weeknum==0,weeknum.shift,weeknum)
    
        return(list("year"=year,"weeknum"=weeknum))
      }else{
        stop("Unknown return.val")
      }
    }
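
    For the direction the question asks about (year + week number to a date), the rule stated in the comments above (Monday is the first day of the week, W01 is the week with 4 Jan in it) can be applied with plain `Date` arithmetic. This is a minimal sketch, not the blog's function: `iso_week_to_date` and its arguments are hypothetical names, and it does no range checking, so a week 53 in a year that only has 52 ISO weeks simply rolls over into the next year.

    ```r
    # Hypothetical helper: ISO year + ISO week + weekday (1 = Monday) -> Date
    iso_week_to_date <- function(year, week, weekday = 1) {
      jan4 <- as.Date(sprintf("%d-01-04", as.integer(year)))
      jan4.wday <- as.integer(format(jan4, "%u"))   # 1 = Monday, ..., 7 = Sunday
      week1.monday <- jan4 - (jan4.wday - 1)        # Monday of ISO week 1
      week1.monday + (week - 1) * 7 + (weekday - 1)
    }

    d <- iso_week_to_date(2008, 41)
    print(d)

    # Format the result back as ISO year, week and weekday to check it
    print(format(d, "%G-W%V-%u"))
    ```

    Because it works on `Date` values rather than seconds, this avoids the time zone and units questions that the `difftime()` calls above have to deal with.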
    