R / lubridate: Calculate number of overlapping days between two periods

予麋鹿 2021-01-16 11:56

I am trying to calculate the number of overlapping days between two time periods. One period is fixed in a start and end date, the other is recorded as start and end dates i

2 Answers
  • 2021-01-16 12:20

    We can use pmin/pmax, which return the element-wise (parallel) min/max of two or more vectors:

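    library(dplyr)

    # No overlap when a row's period ends before my.start or starts after my.end;
    # otherwise count the days in the intersection, inclusive of both ends
    df %>% 
       mutate(overlap = ifelse(my.start > end | my.end < start, 0,
                               pmin(my.end, end) - pmax(my.start, start) + 1))
    #       start        end overlap
    #1 2018-07-15 2018-07-20   0
    #2 2018-07-20 2018-08-05   5
    #3 2018-08-15 2018-08-19   5
    #4 2018-08-20 2018-09-15  12
    #5 2018-09-01 2018-09-15   0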
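
    For anyone who wants to run this without the OP's data, here is a minimal base R sketch. The data frame is copied from the printed output above; the values of my.start and my.end are an assumption (2018-08-01 and 2018-08-31), picked only because they reproduce the overlap column shown, and they are not taken from the question.

    ```r
    # Sample rows copied from the printed output above
    df <- data.frame(
      start = as.Date(c("2018-07-15", "2018-07-20", "2018-08-15",
                        "2018-08-20", "2018-09-01")),
      end   = as.Date(c("2018-07-20", "2018-08-05", "2018-08-19",
                        "2018-09-15", "2018-09-15"))
    )

    # ASSUMPTION: fixed period inferred from the output, not given in the question
    my.start <- as.Date("2018-08-01")
    my.end   <- as.Date("2018-08-31")

    # Inclusive day count of the intersection; negative when the periods are disjoint
    raw <- as.integer(pmin(my.end, df$end) - pmax(my.start, df$start)) + 1L

    # Clamp negative values to zero
    df$overlap <- pmax(raw, 0L)
    print(df)
    ```

    Clamping with pmax(raw, 0L) handles both kinds of non-overlap (period entirely before or entirely after the fixed one) without a separate condition.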
    

    If we want to keep the min/max approach from the OP's code, we have to loop through the rows, because min/max are not vectorised across rows. This can be done either with rowwise() or with map2:

    library(purrr)
    df %>% 
      mutate(overlap = map2_dbl(start, end, ~
            max( as.integer(min(my.end, .y) - max(my.start, .x) + 1), 0)))
    

    I noticed that the OP's actual data have a time component. In that case, adapt the above solution by converting to Date class:

    df %>% 
       mutate(overlap = map2_dbl(start, end, ~
         max(as.integer(min(my.end, as.Date(.y)) - max(my.start, as.Date(.x)) + 1), 0)))
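
    As a small illustration of the conversion step, the sketch below uses two made-up timestamps and the same assumed fixed period as above (2018-08-01 to 2018-08-31). The variable names start_time and end_time are hypothetical, and the time zone is passed explicitly so the result does not depend on the local setting.

    ```r
    # ASSUMPTION: illustrative timestamps and fixed period, not the OP's data
    start_time <- as.POSIXct("2018-08-20 14:30:00", tz = "UTC")
    end_time   <- as.POSIXct("2018-09-15 09:00:00", tz = "UTC")
    my.start   <- as.Date("2018-08-01")
    my.end     <- as.Date("2018-08-31")

    # Drop the time of day so that both sides of the comparison are Dates
    s <- as.Date(start_time, tz = "UTC")
    e <- as.Date(end_time, tz = "UTC")

    # Inclusive overlap in whole days, never below zero
    overlap <- max(as.integer(min(my.end, e) - max(my.start, s)) + 1L, 0L)
    print(overlap)
    ```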
    
  • 2021-01-16 12:22

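    I think you may be running into the difference between max/min and pmax/pmin. max and min reduce all of their arguments to a single value, while pmax and pmin compare element-wise, which is what mutate() needs here: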

    library(dplyr)
    
    df %>%
      mutate(overlap = pmax(pmin(my.end, end) - pmax(my.start, start) + 1,0))
    
           start        end overlap
    1 2018-07-15 2018-07-20  0 days
    2 2018-07-20 2018-08-05  5 days
    3 2018-08-15 2018-08-19  5 days
    4 2018-08-20 2018-09-15 12 days
    5 2018-09-01 2018-09-15  0 days
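
    The difference is easy to see on two small vectors. This is just an illustrative sketch; a and b are made-up values.

    ```r
    a <- c(1, 5, 3)
    b <- c(4, 2, 6)

    # max/min pool all arguments and return one value
    overall_max <- max(a, b)
    overall_min <- min(a, b)

    # pmax/pmin compare position by position and return a vector
    parallel_max <- pmax(a, b)
    parallel_min <- pmin(a, b)

    print(overall_max)   # 6
    print(parallel_max)  # 4 5 6
    print(overall_min)   # 1
    print(parallel_min)  # 1 2 3
    ```

    Inside mutate(), max() would therefore give every row the same value, whereas pmax() gives one value per row.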
    