Find all sequences with the same column value

误落风尘 2020-12-17 18:35

I have the following data frame:

╔══════╦═════════╗
║ Code ║ Airline ║
╠══════╬═════════╣
║    1 ║ AF      ║
║    1 ║ KL      ║
║    8 ║ AR      ║
║    8 ║ A         


        
9 answers
  •  醉梦人生
    2020-12-17 19:06

    You can do this quickly with tidyr's nest and merge, although it is slower unless you first convert Airline from factor to character:

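     library(tidyr)  # provides nest() and re-exports %>%
     # convert the factor to character first; nesting a factor column is slower
     dat$Airline <- as.character(dat$Airline)
     # nest Airline within each Code, then join the nested column back onto every row
     # (tidyr >= 1.0 spells this nest(SharedWith = -Code); .key is deprecated there)
     new_dat <- merge(dat, dat %>% nest(-Code, .key= SharedWith), by="Code")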
    

    which gives:

    > new_dat
      Code Airline SharedWith
    1    1      AF     AF, KL
    2    1      KL     AF, KL
    3    8      AR AR, AZ, DL
    4    8      AZ AR, AZ, DL
    5    8      DL AR, AZ, DL
    

    An advantage of this solution over some of the others: SharedWith becomes a list-column of data frames rather than, say, a character column:

    > str(new_dat$SharedWith)
    List of 5
     $ :'data.frame':   2 obs. of  1 variable:
      ..$ Airline: chr [1:2] "AF" "KL"
     $ :'data.frame':   2 obs. of  1 variable:
      ..$ Airline: chr [1:2] "AF" "KL"
     $ :'data.frame':   3 obs. of  1 variable:
      ..$ Airline: chr [1:3] "AR" "AZ" "DL"
     $ :'data.frame':   3 obs. of  1 variable:
      ..$ Airline: chr [1:3] "AR" "AZ" "DL"
     $ :'data.frame':   3 obs. of  1 variable:
      ..$ Airline: chr [1:3] "AR" "AZ" "DL"
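
For comparison, a similar list-column can be sketched in base R without tidyr. This is an alternative to the answer's approach, not a reproduction of it: the input `dat` is reconstructed from the printed output above (an assumption), and each `SharedWith` entry here is a plain character vector rather than a one-column data frame.

```r
# Input reconstructed from the printed output above (an assumption).
dat <- data.frame(
  Code = c(1, 1, 8, 8, 8),
  Airline = c("AF", "KL", "AR", "AZ", "DL"),
  stringsAsFactors = FALSE
)

# One character vector of airlines per Code, named "1" and "8".
groups <- split(dat$Airline, dat$Code)

# Look up each row's group by its Code and store the result as a list-column.
dat$SharedWith <- unname(groups[as.character(dat$Code)])

# Each element can be indexed directly, with no $Airline step.
str(dat$SharedWith[[3]])
```

Storing bare vectors drops one level of indexing; whether that is preferable to nested data frames depends on whether more columns might later be nested alongside Airline.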
    

    So you can then easily (albeit not prettily) index out vectors of the shared values, like:

    > new_dat$SharedWith[[1]]$Airline
    [1] "AF" "KL"
    

    rather than having to use strsplit or similar
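
To make that contrast concrete, the sketch below builds the character-column alternative in base R and shows the extra `strsplit` step it requires. The column name `SharedChr` and the reconstructed `dat` are illustrative assumptions, not part of the original answer.

```r
# Input reconstructed from the printed output above (an assumption).
dat <- data.frame(
  Code = c(1, 1, 8, 8, 8),
  Airline = c("AF", "KL", "AR", "AZ", "DL"),
  stringsAsFactors = FALSE
)

# Collapse each Code's airlines into one comma-separated string.
collapsed <- vapply(
  split(dat$Airline, dat$Code),
  paste, character(1),
  collapse = ", "
)

# Attach the collapsed string to every row with the matching Code.
dat$SharedChr <- unname(collapsed[as.character(dat$Code)])

# Recovering the individual values needs an extra parsing step.
shared <- strsplit(dat$SharedChr[1], ", ", fixed = TRUE)[[1]]
```

The parsing step also depends on the separator never appearing inside a value, which a list-column does not have to worry about.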
