Calculating the number of days between 2 columns of dates in a data frame

轮回少年 2020-12-02 12:56

I have a data frame which has two columns of dates in the format yyyy/mm/dd. I am trying to calculate the number of days between these two dates for each observation within

5 Answers
  • 2020-12-02 13:25

    Following Ronald's example: if your dates are stored in a different format (as shown below), change the format argument of as.Date to match.

    survey <- data.frame(date=c("2012-07-26","2012-07-25"),tx_start=c("2012-01-01","2012-01-01"))
    
    survey$date_diff <- as.Date(as.character(survey$date), format="%Y-%m-%d")-
                  as.Date(as.character(survey$tx_start), format="%Y-%m-%d")
    

    survey:

            date   tx_start date_diff
    1 2012-07-26 2012-01-01  207 days
    2 2012-07-25 2012-01-01  206 days
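
    If the format string does not match the data, as.Date returns NA instead of raising an error, so it can be worth checking the parsed columns before subtracting. A minimal sketch, assuming a hypothetical data frame `survey2` whose two columns use different layouts (both the name and the data are made up for illustration):

    ```r
    # Hypothetical data: the two columns use different date formats
    survey2 <- data.frame(date     = c("26/07/2012", "25/07/2012"),
                          tx_start = c("2012-01-01", "2012-01-01"))

    # Parse each column with the format that matches its layout
    end   <- as.Date(as.character(survey2$date),     format = "%d/%m/%Y")
    start <- as.Date(as.character(survey2$tx_start), format = "%Y-%m-%d")

    # A mismatched format yields NA, so stop early if anything failed to parse
    stopifnot(!anyNA(end), !anyNA(start))

    survey2$date_diff <- end - start
    survey2$date_diff
    ```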
    
  • 2020-12-02 13:42

    You need to use the as.Date formats correctly.

    For example:

    x = '2012/07/25'
    xd = as.Date(x,'%Y/%m/%d')
    xd    # Prints "2012-07-25"
    

    R date formats are similar to the *nix ones.

    Calling typeof(xd) shows that the Date is stored as a double, i.e. the number of days since 1970-01-01.
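
    This can be checked directly. A small sketch using only base R; the number in the comment is the day count from 1970-01-01 to 2012-07-25:

    ```r
    xd <- as.Date("2012/07/25", format = "%Y/%m/%d")

    class(xd)       # "Date"
    typeof(xd)      # "double"
    as.numeric(xd)  # 15546, the number of days since 1970-01-01

    # Because a Date is a day count, subtracting two Dates gives a difference in days
    xd - as.Date("1970-01-01")
    ```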

  • 2020-12-02 13:45

    Without seeing your data (you can post the output of dput(head(survey)) to show us), this is a shot in the dark:

    survey <- data.frame(date=c("2012/07/26","2012/07/25"),tx_start=c("2012/01/01","2012/01/01"))
    
    survey$date_diff <- as.Date(as.character(survey$date), format="%Y/%m/%d")-
                      as.Date(as.character(survey$tx_start), format="%Y/%m/%d")
    survey
           date   tx_start date_diff
    1 2012/07/26 2012/01/01  207 days
    2 2012/07/25 2012/01/01  206 days
    
  • 2020-12-02 13:50

    You could find the difference between dates in columns in a data frame by using the function difftime as follows:

    df$diff_in_days <- difftime(df$datevar1, df$datevar2, units = "days")
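
    If the columns are still character strings, difftime may parse them as date-times in the local time zone, so converting to Date first tends to be safer. A sketch under the assumption that the columns use the yyyy/mm/dd layout from the question; `df`, `datevar1` and `datevar2` are the placeholder names from the line above, filled with made-up data:

    ```r
    # Placeholder data in the yyyy/mm/dd layout described in the question
    df <- data.frame(datevar1 = c("2012/07/26", "2012/07/25"),
                     datevar2 = c("2012/01/01", "2012/01/01"))

    # Convert the character columns to Date before taking the difference
    d1 <- as.Date(as.character(df$datevar1), format = "%Y/%m/%d")
    d2 <- as.Date(as.character(df$datevar2), format = "%Y/%m/%d")

    # as.numeric() drops the "days" unit and leaves a plain number
    df$diff_in_days <- as.numeric(difftime(d1, d2, units = "days"))
    df$diff_in_days  # 207 206
    ```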
    
  • 2020-12-02 13:51

    Following Ronald's example, I would add that you should decide whether the start and end dates themselves are to be included in the day count. I faced the same problem and ended up using a third option with apply. It may be memory-inefficient, but it helps to understand the problem:

    survey <- data.frame(date=c("2012/07/26","2012/07/25"),tx_start=c("2012/01/01","2012/01/01"))
    
    survey$diff_1 <- as.numeric(
      as.Date(as.character(survey$date), format="%Y/%m/%d")-
        as.Date(as.character(survey$tx_start), format="%Y/%m/%d")
    )
    
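    # difftime() converts the character columns to date-times in the local
    # time zone, so a daylight-saving shift can make the result fractional
    survey$diff_2 <- as.numeric(
      difftime(survey$date, survey$tx_start, units = "days")
    )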
    
    survey$diff_3 <- apply(X = survey[,c("date", "tx_start")],
                           MARGIN = 1,
                           FUN = function(x)
                             length(
                               seq.Date(
                                 from = as.Date(x[2]),
                                 to = as.Date(x[1]),
                                 by = "day")
                               )
                           )
    

    This gives the following date differences:

            date   tx_start diff_1   diff_2 diff_3
    1 2012/07/26 2012/01/01    207 206.9583    208
    2 2012/07/25 2012/01/01    206 205.9583    207
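
    The inclusive count in diff_3 can also be obtained without apply, since it is the plain Date difference plus one. A vectorised sketch, assuming the same survey data as above; the column names `days_elapsed` and `days_inclusive` are made up for illustration:

    ```r
    survey <- data.frame(date     = c("2012/07/26", "2012/07/25"),
                         tx_start = c("2012/01/01", "2012/01/01"))

    end   <- as.Date(as.character(survey$date),     format = "%Y/%m/%d")
    start <- as.Date(as.character(survey$tx_start), format = "%Y/%m/%d")

    # Days elapsed between the two dates (matches diff_1)
    survey$days_elapsed <- as.numeric(end - start)

    # Days counted with both the start and end date included (matches diff_3)
    survey$days_inclusive <- survey$days_elapsed + 1

    survey
    ```

    Which of the two is wanted depends on whether the first day counts as day 0 or day 1.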
    