I am trying to calculate the number of overlapping days between two time periods. One period is defined by a fixed start and end date; the other is recorded as start and end dates i
We can use pmin/pmax to get the element-wise min/max of two vectors:
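library(dplyr)

df %>%
  # Zero when the periods do not intersect at all; otherwise count days inclusively.
  mutate(overlap = ifelse(my.start > end | start > my.end, 0,
                          pmin(my.end, end) - pmax(my.start, start) + 1))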
#        start        end overlap
# 1 2018-07-15 2018-07-20       0
# 2 2018-07-20 2018-08-05       5
# 3 2018-08-15 2018-08-19       5
# 4 2018-08-20 2018-09-15      12
# 5 2018-09-01 2018-09-15       0
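To make the result above reproducible, here is a minimal base R sketch. The start and end columns are copied from the printed output; the fixed period my.start/my.end is not given in the thread, so the values below are an assumption inferred from the overlaps shown (2018-08-01 to 2018-08-31).

```r
# Sample intervals copied from the printed output above.
df <- data.frame(
  start = as.Date(c("2018-07-15", "2018-07-20", "2018-08-15",
                    "2018-08-20", "2018-09-01")),
  end   = as.Date(c("2018-07-20", "2018-08-05", "2018-08-19",
                    "2018-09-15", "2018-09-15"))
)

# Assumed fixed period (not stated in the thread; inferred from the output).
my.start <- as.Date("2018-08-01")
my.end   <- as.Date("2018-08-31")

# Inclusive day count per row; as.numeric() drops the difftime class.
raw <- as.numeric(pmin(my.end, df$end) - pmax(my.start, df$start)) + 1

# Rows that do not intersect the fixed period give a value <= 0; clamp to 0.
df$overlap <- pmax(raw, 0)
print(df)
```

Clamping with pmax(raw, 0) handles rows that fall entirely before or entirely after the fixed period in one step, without a separate condition for each side.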
If we want to use the same approach as in the OP's code, i.e. min/max, we need to loop through the rows, either with rowwise() or with map2:
library(dplyr)
library(purrr)

df %>%
  mutate(overlap = map2_dbl(start, end, ~
    max(as.integer(min(my.end, .y) - max(my.start, .x) + 1), 0)))
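The same row-by-row idea can also be written without extra packages. The sketch below swaps in base R's vapply over row indices in place of rowwise()/map2, so it is an illustration of the looping approach rather than the code from the answer. The sample data and the fixed period my.start/my.end are assumptions chosen to match the output printed earlier.

```r
df <- data.frame(
  start = as.Date(c("2018-07-15", "2018-07-20", "2018-08-15",
                    "2018-08-20", "2018-09-01")),
  end   = as.Date(c("2018-07-20", "2018-08-05", "2018-08-19",
                    "2018-09-15", "2018-09-15"))
)
my.start <- as.Date("2018-08-01")  # assumed fixed period start
my.end   <- as.Date("2018-08-31")  # assumed fixed period end

# Visit one row at a time, so scalar min()/max() are safe to use here.
df$overlap <- vapply(seq_len(nrow(df)), function(i) {
  days <- as.numeric(min(my.end, df$end[i]) - max(my.start, df$start[i])) + 1
  max(days, 0)  # no overlap yields a non-positive count; report 0 instead
}, numeric(1))

print(df)
```

Looping is usually slower than the vectorised pmin/pmax version on large data frames, but it keeps the logic identical to the scalar min/max formulation.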
I noticed that the OP's actual data have a time component. In that case, adapt the above solution by converting to the Date class:
df %>%
  mutate(overlap = map2_dbl(start, end, ~
    max(as.integer(min(my.end, as.Date(.y)) - max(my.start, as.Date(.x)) + 1), 0)))
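As a small illustration of the conversion step, the sketch below takes a single interval stored as POSIXct date-times and reduces it to Date before computing the overlap. The timestamps, the UTC time zone, and the fixed period are all assumptions made up for this example; real data in another time zone may need an explicit tz argument in as.Date().

```r
my.start <- as.Date("2018-08-01")  # assumed fixed period start
my.end   <- as.Date("2018-08-31")  # assumed fixed period end

# Hypothetical interval with a time component, stored in UTC.
start_time <- as.POSIXct("2018-07-20 13:45:00", tz = "UTC")
end_time   <- as.POSIXct("2018-08-05 09:30:00", tz = "UTC")

# Drop the time of day so both sides of the comparison are Date objects.
start_day <- as.Date(start_time)
end_day   <- as.Date(end_time)

# Inclusive overlap in days, clamped at zero.
overlap <- max(as.integer(min(my.end, end_day) - max(my.start, start_day) + 1), 0)
print(overlap)
```

Without the conversion, the Date values and the POSIXct values are stored in different units (days versus seconds), so comparing or subtracting them directly is unlikely to give a meaningful day count.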
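I think you may be running into issues with max and min vs pmax and pmin. max and min collapse all of their arguments to a single value, whereas pmax and pmin compare element by element, which is what mutate needs here: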
library(dplyr)

df %>%
  mutate(overlap = pmax(pmin(my.end, end) - pmax(my.start, start) + 1, 0))
       start        end overlap
1 2018-07-15 2018-07-20  0 days
2 2018-07-20 2018-08-05  5 days
3 2018-08-15 2018-08-19  5 days
4 2018-08-20 2018-09-15 12 days
5 2018-09-01 2018-09-15  0 days
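The difference between min and pmin is easy to see on a short vector. This is a minimal base R sketch; the three end dates are taken from the table above and my.end is an assumed value for the fixed period.

```r
ends   <- as.Date(c("2018-07-20", "2018-08-05", "2018-09-15"))
my.end <- as.Date("2018-08-31")  # assumed fixed period end

# min() pools every value from both arguments and returns one Date.
scalar_min <- min(my.end, ends)

# pmin() recycles my.end and compares it against each element of ends.
elementwise_min <- pmin(my.end, ends)

print(scalar_min)       # a single Date: 2018-07-20
print(elementwise_min)  # three Dates: 2018-07-20, 2018-08-05, 2018-08-31
```

Inside mutate, the single value returned by min would be recycled to every row, which is why the row-wise results come out wrong unless pmin/pmax or an explicit per-row loop is used.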