Calculate ages in R

北荒 2020-11-27 16:39

I have two data frames in R. One frame has a person's year of birth:

YEAR
/1931
/1924

and then another column shows a more recent time.

8 answers
  • 2020-11-27 17:11

    The following function takes two vectors of Date objects (birth dates and reference dates) and calculates the ages in whole years, correctly accounting for leap years. It seems simpler than the other answers.

    age = function(from, to) {
      from_lt = as.POSIXlt(from)
      to_lt = as.POSIXlt(to)
    
      age = to_lt$year - from_lt$year
    
      ifelse(to_lt$mon < from_lt$mon |
             (to_lt$mon == from_lt$mon & to_lt$mday < from_lt$mday),
             age - 1, age)
    }
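    As a quick check of the function above, here is a minimal sketch that runs it on two made-up birth dates, one of them a leap-day birthday. The dates and the names `birth`, `ref`, and `before_birthday` are assumptions for illustration; the function body repeats the logic above so the block runs on its own.

```r
# Same logic as the age() function above, repeated so this block is self-contained
age <- function(from, to) {
  from_lt <- as.POSIXlt(from)
  to_lt <- as.POSIXlt(to)
  years <- to_lt$year - from_lt$year
  # TRUE when the birthday has not yet been reached in the reference year
  before_birthday <- to_lt$mon < from_lt$mon |
    (to_lt$mon == from_lt$mon & to_lt$mday < from_lt$mday)
  ifelse(before_birthday, years - 1, years)
}

# Hypothetical birth dates and reference dates (not from the question)
birth <- as.Date(c("1931-03-15", "2000-02-29"))
ref   <- as.Date(c("2005-09-08", "2001-02-28"))

ages <- age(birth, ref)
print(ages)
```

    The leap-day case stays at 0 on 2001-02-28 and only reaches 1 on 2001-03-01, because the comparison uses month and day of month rather than a day count.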
    
  • 2020-11-27 17:11

    You can solve this with the lubridate package.

    > library(lubridate)
    

    I don't think /1931 is a common date class. So I'll assume all the entries are character strings.

    > RECENT <- data.frame(recent = c("09/08/2005", "11/08/2005"))
    > YEAR <- data.frame(year = c("/1931", "/1924"))
    

    First, let's tell R that the recent dates are dates. I'll assume the dates are in month/day/year order, so I use mdy(). If they're in day/month/year order, use dmy() instead.

    > RECENT$recent <- mdy(RECENT$recent)
    > RECENT
          recent
    1 2005-09-08
    2 2005-11-08
    

    Now, let's turn the years into numbers so we can do some math with them.

    > YEAR$year <- as.numeric(substr(YEAR$year, 2, 5))
    

    Now just do the math. year() extracts the year value of the RECENT dates.

    > year(RECENT$recent) - YEAR
      year
    1   74
    2   81
    

    P.S. If your year entries are actually full dates, you can get the difference in years and months with

    > YEAR1 <- data.frame(year = mdy("01/08/1931","01/08/1924"))
    > as.period(RECENT$recent - YEAR1$year, units = "year")
    [1] 74 years and 8 months   81 years and 10 months
    
  • 2020-11-27 17:20

    Given the data in your example:

    > m <- data.frame(YEAR=c("/1931", "/1924"),RECENT=c("09/08/2005","11/08/2005"))
    > m
       YEAR     RECENT
    1 /1931 09/08/2005
    2 /1924 11/08/2005
    

    Extract year with the strptime function:

    > strptime(m[,2], format = "%m/%d/%Y")$year - strptime(m[,1], format = "/%Y")$year
    [1] 74 81
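    The subtraction above works because strptime() returns a POSIXlt object whose `$year` field counts years since 1900, so the offset cancels out. A small sketch to illustrate this, using one of the dates from the question (the variable name `recent` is an assumption):

```r
# strptime() returns a POSIXlt object
recent <- strptime("09/08/2005", format = "%m/%d/%Y")
print(class(recent))

# $year is stored as years since 1900, not as the calendar year
print(recent$year)

# Add 1900 to recover the calendar year
print(recent$year + 1900)
```

    If you need the calendar year itself rather than a difference, remember to add 1900.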
    
  • 2020-11-27 17:22

    I use a custom function (code below). It is convenient to use inside mutate and quite flexible. You'll need the lubridate package.

    Examples (when to_date is omitted, the result depends on the day the code is run)

    get_age("2000-01-01")
    # [1] 17
    get_age(lubridate::as_date("2000-01-01"))
    # [1] 17
    get_age("2000-01-01","2015-06-15")
    # [1] 15
    get_age("2000-01-01",dec = TRUE)
    # [1] 17.92175
    get_age(c("2000-01-01","2003-04-12"))
    # [1] 17 14
    get_age(c("2000-01-01","2003-04-12"),dec = TRUE)
    # [1] 17.92176 14.64231
    

    Function

    #' Get age
    #' 
    #' Returns age, decimal or not, from single value or vector of strings
    #' or dates, compared to a reference date defaulting to now. Note that
    #' default is NOT the rounded value of decimal age.
    #' @param from_date vector or single value of dates or characters
    #' @param to_date date when age is to be computed
    #' @param dec return decimal age or not
    #' @examples
    #' get_age("2000-01-01")
    #' get_age(lubridate::as_date("2000-01-01"))
    #' get_age("2000-01-01","2015-06-15")
    #' get_age("2000-01-01",dec = TRUE)
    #' get_age(c("2000-01-01","2003-04-12"))
    #' get_age(c("2000-01-01","2003-04-12"),dec = TRUE)
    get_age <- function(from_date,to_date = lubridate::now(),dec = FALSE){
      if(is.character(from_date)) from_date <- lubridate::as_date(from_date)
      if(is.character(to_date))   to_date   <- lubridate::as_date(to_date)
      if (dec) {
        age <- lubridate::interval(start = from_date, end = to_date) /
          (lubridate::days(365) + lubridate::hours(6))
      } else {
        age <- lubridate::year(lubridate::as.period(
          lubridate::interval(start = from_date, end = to_date)))
      }
      age
    }
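    The decimal branch above divides by a year of 365 days and 6 hours, that is, 365.25 days. Below is a rough base-R sketch of the same idea for readers without lubridate. `decimal_age` is a hypothetical helper, not part of the function above, and it is not guaranteed to match lubridate's result to the last digit.

```r
# Hypothetical helper: decimal age in years, assuming a 365.25-day year
decimal_age <- function(from_date, to_date) {
  days <- as.numeric(difftime(as.Date(to_date), as.Date(from_date),
                              units = "days"))
  days / 365.25
}

x <- decimal_age("2000-01-01", "2015-06-15")
print(x)         # decimal age
print(floor(x))  # whole years under the same 365.25-day assumption
```

    Note that flooring a decimal age is not always the same as the whole-year age, which is why the function's documentation says the default is not the rounded decimal value.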
    
  • 2020-11-27 17:24

    You can do some formatting:

    as.numeric(format(as.Date("01/01/2010", format="%m/%d/%Y"), format="%Y")) - 1930
    

    With your data:

    > yr <- c(1931, 1924)
    > recent <- c("09/08/2005", "11/08/2005")
    > as.numeric(format(as.Date(recent, format="%m/%d/%Y"), format="%Y")) - yr
    [1] 74 81
    

    Since you have your data in a data.frame (I'll assume that it's called df), it will be more like this:

    as.numeric(format(as.Date(df$recent, format="%m/%d/%Y"), format="%Y")) - df$year
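    When only the year of birth is known, the year difference is the age the person reaches during the recent year, so the actual age on the recent date is either that value or one less. A minimal sketch of reporting both bounds, using the question's data (the names `age_min` and `age_max` are assumptions for illustration):

```r
birth_year <- c(1931, 1924)
recent <- as.Date(c("09/08/2005", "11/08/2005"), format = "%m/%d/%Y")

recent_year <- as.numeric(format(recent, "%Y"))

# Upper bound: the birthday has already occurred in the recent year
age_max <- recent_year - birth_year
# Lower bound: the birthday is still to come in the recent year
age_min <- age_max - 1

print(data.frame(birth_year, recent, age_min, age_max))
```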
    
  • 2020-11-27 17:24

    A robust approach that also supports vectors, using the lubridate package:

    library(lubridate)

    age <- function(date.birth, date.ref = Sys.Date()) {
      # Recycle a single reference date across all birth dates
      if (length(date.birth) > 1 && length(date.ref) == 1) {
        date.ref <- rep(date.ref, length(date.birth))
      }

      # Encode month and day as month * 100 + day; pasting them together
      # without padding would make 1/15 and 11/5 both come out as 115
      date.birth.monthdays <- month(date.birth) * 100 + day(date.birth)
      date.ref.monthdays <- month(date.ref) * 100 + day(date.ref)

      age.calc <- numeric(length(date.birth))

      for (i in seq_along(date.birth)) {
        if (date.birth.monthdays[i] <= date.ref.monthdays[i]) {
          # birthday has already occurred in the reference year
          age.calc[i] <- year(date.ref[i]) - year(date.birth[i])
        } else {
          # birthday is still to come in the reference year
          age.calc[i] <- year(date.ref[i]) - year(date.birth[i]) - 1
        }
      }
      age.calc
    }
    

    This also accounts for leap years, because it only checks whether the birthday has already occurred in the reference year.
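    Comparing month and day as a single number only works if the encoding is unambiguous and preserves calendar order. A short base-R sketch of why unpadded concatenation is risky and why month * 100 + day is safe (`md` is a hypothetical helper used only for this illustration):

```r
# Unpadded concatenation: January 15 and November 5 both become "115"
collision <- paste0(1, 15) == paste0(11, 5)
print(collision)

# month * 100 + day keeps every calendar day distinct and in order
md <- function(d) {
  as.integer(format(d, "%m")) * 100L + as.integer(format(d, "%d"))
}

print(md(as.Date("2001-01-15")))
print(md(as.Date("2001-11-05")))

# February 10 sorts before October 1, as it should
print(md(as.Date("2001-02-10")) < md(as.Date("2001-10-01")))
```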
