Determining at most 1 hour time difference between car and non-car mode

一生所求 2021-01-27 02:15

I have the following data:

 household   person   time       mode
 1           1        07:45:00   non-car
 1           1        09:05:00   car
        
1 answer
  •  慢半拍i (OP)
    2021-01-27 02:30

    Here's a dplyr approach that produces those matches.

    library(dplyr); library(hms)
    # Assumes df$time is an hms column, so differences come out in seconds.
    df %>%
      # Join the table to itself by household, so every row is paired
      #   with every row (including itself) in the same household.
      #   Columns from the original data end in .x and columns from
      #   the joined copy end in .y, so we can compare them below.
      left_join(df, by = c("household")) %>%
      # Find the absolute difference in time, in seconds
      mutate(time_dif = abs(time.y - time.x)) %>%
      filter(time_dif < 3600,       # Keep if <1hr difference
             person.x != person.y,  # Keep if different person
             mode.x != mode.y) %>%  # Keep if different mode
    
      # We have the answers now; everything below is formatting
    
      # Rename the variables we keep and drop the ones we no longer need
      select(household, person = person.x, time = time.x, 
             mode = mode.x, other = person.y) %>%
      # Combine each person's overlaps into one row
      group_by(household, person, time) %>%
      summarise(overlaps = paste(other, collapse = ","), times = length(other)) %>%
      # Add back all original rows, even those with no overlaps
      right_join(df, by = c("household", "person", "time")) %>%
      ungroup()
    
    
    ## A tibble: 7 x 6
    #  household person time   overlaps times mode   
    #         
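
    To try the join-and-filter step in isolation, here is a minimal sketch on made-up data. The sample rows below are hypothetical (they are not the asker's data), and `pairs` is just an illustrative name. It converts the times to seconds with `as.numeric()` before subtracting, which is a small variation on the code above that avoids depending on the units of a time difference.

    ```r
    library(dplyr)
    library(hms)

    # Hypothetical sample data: values are invented for illustration only
    df <- tibble(
      household = c(1, 1, 1, 2, 2),
      person    = c(1, 2, 3, 1, 2),
      time      = parse_hms(c("07:45:00", "08:05:00", "09:30:00",
                              "07:00:00", "10:00:00")),
      mode      = c("non-car", "car", "car", "car", "non-car")
    )

    pairs <- df %>%
      # Self-join by household: every row meets every row of its household
      left_join(df, by = "household") %>%
      # as.numeric() on an hms value gives seconds since midnight
      mutate(time_dif = abs(as.numeric(time.y) - as.numeric(time.x))) %>%
      filter(time_dif < 3600,        # less than one hour apart
             person.x != person.y,   # different people
             mode.x != mode.y)       # one car, one non-car

    pairs
    ```

    With this sample, only persons 1 and 2 of household 1 match each other (20 minutes apart, different modes), so `pairs` has two rows, one for each direction of the pair. Recent dplyr versions may warn about a many-to-many relationship on the self-join; that is expected here.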
