Extracting event types from last 21 day window

迷失自我 2021-01-20 22:01

My dataframe looks like this. The two rightmost columns are my desired columns.

Name      ActivityType     ActivityDate   Email(last 21 days)  Webinar(last


        
2 Answers
  •  天涯浪人
    2021-01-20 22:13

    Here is another option with base R:

    df is first split by Name. Then, within each subset and for each Sale row, the code checks whether there is an Email (or a Webinar) in the 21 days before that Sale. Finally, the list is unsplit by Name to restore the original row order.
    Afterwards, you just have to replace FALSE with "no" and TRUE with "yes".

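    df_split <- split(df, df$Name)

    df_split <- lapply(df_split, function(tab) {
      # Rows that are sales; every other row keeps NA in the new columns
      i_s <- which(tab$ActivityType == "Sale")
      # For each sale: is there an activity of `type` in the 21 days up to (and including) the sale date?
      hit <- function(type) vapply(i_s, function(i) {
        d <- tab$ActivityDate[tab$ActivityType == type]
        any(d >= tab$ActivityDate[i] - 21 & d <= tab$ActivityDate[i])
      }, logical(1))
      # Initialise with NA so the assignment also works when the last row is not a Sale
      tab$Email21 <- NA
      tab$Webinar21 <- NA
      tab$Email21[i_s] <- hit("Email")
      tab$Webinar21[i_s] <- hit("Webinar")
      tab
    })
    df_res <- unsplit(df_split, df$Name)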
    
    df_res
    #   Name ActivityType ActivityDate Email21 Webinar21
    #1  John        Email   2014-01-01      NA        NA
    #2  John      Webinar   2014-01-05      NA        NA
    #3  John         Sale   2014-01-20    TRUE      TRUE
    #4  John      Webinar   2014-03-25      NA        NA
    #5  John         Sale   2014-04-01   FALSE      TRUE
    #6  John         Sale   2014-07-01   FALSE     FALSE
    #7   Tom        Email   2015-01-01      NA        NA
    #8   Tom      Webinar   2015-01-05      NA        NA
    #9   Tom         Sale   2015-01-20    TRUE      TRUE
    #10  Tom      Webinar   2015-03-25      NA        NA
    #11  Tom         Sale   2015-04-01   FALSE      TRUE
    #12  Tom         Sale   2015-07-01   FALSE     FALSE
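
To get the yes/no labels mentioned above, one option is ifelse(), which leaves NA as NA. This is only a minimal sketch, assuming the flag columns are logical as in df_res; the small `flags` data frame and the `to_yes_no` helper are made up here for illustration and are not part of the original data.

```r
# Hypothetical stand-in for the two flag columns of df_res
# (NA means the row is not a Sale)
flags <- data.frame(
  Email21   = c(NA, NA, TRUE, NA, FALSE, FALSE),
  Webinar21 = c(NA, NA, TRUE, NA, TRUE, FALSE)
)

# Map TRUE -> "yes" and FALSE -> "no"; ifelse() keeps NA as NA
to_yes_no <- function(x) ifelse(x, "yes", "no")

# Apply the helper to every column while keeping the data frame structure
flags[] <- lapply(flags, to_yes_no)

flags$Email21
# [1] NA    NA    "yes" NA    "no"  "no"
```

On the real result, the same helper could be applied to just the two new columns, e.g. df_res[c("Email21", "Webinar21")] <- lapply(df_res[c("Email21", "Webinar21")], to_yes_no).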
    

    data

    df <- structure(list(Name = c("John", "John", "John", "John", "John", 
    "John", "Tom", "Tom", "Tom", "Tom", "Tom", "Tom"), ActivityType = c("Email", 
    "Webinar", "Sale", "Webinar", "Sale", "Sale", "Email", "Webinar", 
    "Sale", "Webinar", "Sale", "Sale"), ActivityDate = structure(c(16071, 
    16075, 16090, 16154, 16161, 16252, 16436, 16440, 16455, 16519, 
    16526, 16617), class = "Date")), .Names = c("Name", "ActivityType", 
    "ActivityDate"), row.names = c(NA, -12L), index = structure(integer(0), ActivityType = c(1L, 
    7L, 3L, 5L, 6L, 9L, 11L, 12L, 2L, 4L, 8L, 10L)), class = "data.frame")
    
