My dataframe looks like this. The two rightmost columns are my desired columns.
**Name ActivityType ActivityDate Email(last 21 days) Webinar(last
Here is another option with base R. The data frame `df` is first split by `Name`. Then, within each subset and for each `Sale` row, the code checks whether there is an `Email` (or `Webinar`) in the 21 days before the sale. Finally, the list is unsplit by `Name`. To get the labels you want, you just have to replace `FALSE` with `no` and `TRUE` with `yes`.
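df_split <- split(df, df$Name)
df_split <- lapply(df_split, function(tab){
  i_s <- which(tab[,2]=="Sale")
  # start with NA everywhere so that non-Sale rows stay NA
  tab$Email21 <- tab$Webinar21 <- NA
  # TRUE when an Email (Webinar) falls in the 21 days up to and including the sale date
  tab$Email21[i_s] <- sapply(tab[i_s, 3], function(d_s){d <- tab[tab$ActivityType=="Email", 3]; any(d >= d_s-21 & d <= d_s)})
  tab$Webinar21[i_s] <- sapply(tab[i_s, 3], function(d_s){d <- tab[tab$ActivityType=="Webinar", 3]; any(d >= d_s-21 & d <= d_s)})
  tab
})
df_res <- unsplit(df_split, df$Name)
df_res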
#    Name ActivityType ActivityDate Email21 Webinar21
#1   John        Email   2014-01-01      NA        NA
#2   John      Webinar   2014-01-05      NA        NA
#3   John         Sale   2014-01-20    TRUE      TRUE
#4   John      Webinar   2014-03-25      NA        NA
#5   John         Sale   2014-04-01   FALSE      TRUE
#6   John         Sale   2014-07-01   FALSE     FALSE
#7    Tom        Email   2015-01-01      NA        NA
#8    Tom      Webinar   2015-01-05      NA        NA
#9    Tom         Sale   2015-01-20    TRUE      TRUE
#10   Tom      Webinar   2015-03-25      NA        NA
#11   Tom         Sale   2015-04-01   FALSE      TRUE
#12   Tom         Sale   2015-07-01   FALSE     FALSE
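The `TRUE`/`FALSE` to `yes`/`no` replacement mentioned above could be done along these lines. This is only a sketch: `res` is a small hypothetical stand-in for the two logical columns of `df_res`, so the block runs on its own.

```r
# Hypothetical stand-in for the Email21 / Webinar21 columns of df_res
# (logical values, NA on the rows that are not sales)
res <- data.frame(Email21   = c(NA, NA, TRUE, NA, FALSE, FALSE),
                  Webinar21 = c(NA, NA, TRUE, NA, TRUE,  FALSE))

# ifelse() leaves NA as NA, so non-Sale rows stay empty
res[] <- lapply(res, function(x) ifelse(x, "yes", "no"))
res
```

On the real result, the same `ifelse()` call can be applied to `df_res$Email21` and `df_res$Webinar21` directly.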
Data:
df <- structure(list(Name = c("John", "John", "John", "John", "John",
"John", "Tom", "Tom", "Tom", "Tom", "Tom", "Tom"), ActivityType = c("Email",
"Webinar", "Sale", "Webinar", "Sale", "Sale", "Email", "Webinar",
"Sale", "Webinar", "Sale", "Sale"), ActivityDate = structure(c(16071,
16075, 16090, 16154, 16161, 16252, 16436, 16440, 16455, 16519,
16526, 16617), class = "Date")), .Names = c("Name", "ActivityType",
"ActivityDate"), row.names = c(NA, -12L), index = structure(integer(0), ActivityType = c(1L,
7L, 3L, 5L, 6L, 9L, 11L, 12L, 2L, 4L, 8L, 10L)), class = "data.frame")
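To see how the 21-day check behaves on a single row, here is a small worked example using John's dates from the data. The variable names are made up for illustration, and treating "last 21 days" as the window from 21 days before the sale up to the sale date itself is an assumption.

```r
# Dates copied from John's rows in the data
sale     <- as.Date("2014-04-01")
emails   <- as.Date("2014-01-01")
webinars <- as.Date(c("2014-01-05", "2014-03-25"))

# Subtracting a number from a Date shifts it by that many days
window_start <- sale - 21

# An activity counts if it falls between window_start and the sale date
email_hit   <- any(emails   >= window_start & emails   <= sale)
webinar_hit <- any(webinars >= window_start & webinars <= sale)

window_start  # 2014-03-11
email_hit     # FALSE
webinar_hit   # TRUE
```

This matches row 5 of the result: the only email is too old, while the webinar on 2014-03-25 falls inside the window.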