dplyr filtering on multiple columns using “%in%”

悲&欢浪女 2021-01-29 03:43

I have a dataframe (df1) with multiple columns (ID, Number, Location, Field, Weight). I also have another dataframe (df2) with more information (ID, PassRate, Number, Weight).

2 answers
  •  后悔当初
    2021-01-29 04:13

    Try this:

    df1[paste0(df1$ID, df1$Weight) %in% paste0(df2$ID, df2$Weight), ]
    

    Your code filters each column of df1 against df2 independently; it does not check whether a whole row (ID together with Weight) matches a row in df2.

    Take this sample data:

    df1 
    ID  Weight
    1   a
    2   b
    
    
    df2 
    ID  Weight
    1   b
    2   a
    

    Using your approach:

    df_sub <- subset(df1, df1$ID %in% df2$ID & df1$Weight %in% df2$Weight)
    
    
    > df_sub
      ID Weight
    1  1      a
    2  2      b
    

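    In fact, each %in% test is evaluated per column and returns a logical vector. Every ID and every Weight in df1 appears somewhere in df2, so both vectors are all TRUE and every row of df1 is kept:

     TRUE  TRUE
     TRUE  TRUE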
    

    With my approach, no rows match:

     df1[paste0(df1$ID, df1$Weight) %in% paste0(df2$ID, df2$Weight), ]
    
    [1] ID     Weight
    <0 rows> (or 0-length row.names)
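
The comparison above can be reproduced as a short, self-contained sketch. It rebuilds the sample data and contrasts the column-wise test with a row-wise key match. The variable names (colwise, rowwise, key1, key2, df_rowwise, df_semi) and the "_" separator are assumptions added for illustration and are not part of the original answer. The separator is there because paste0() with no separator can make different pairs collide; it is assumed not to occur in the values themselves. The optional dplyr::semi_join() call is an alternative way to express the same row match, not the answerer's method.

```r
# Sample data from the answer above
df1 <- data.frame(ID = c(1, 2), Weight = c("a", "b"), stringsAsFactors = FALSE)
df2 <- data.frame(ID = c(1, 2), Weight = c("b", "a"), stringsAsFactors = FALSE)

# Column-wise test: each column is checked on its own, so both rows pass
colwise <- df1$ID %in% df2$ID & df1$Weight %in% df2$Weight

# Without a separator, different (ID, Weight) pairs can produce the same key
collides <- paste0(1, "12") == paste0(11, "2")  # both give "112"

# Row-wise test: one key per row, with an assumed "_" separator
key1 <- paste(df1$ID, df1$Weight, sep = "_")
key2 <- paste(df2$ID, df2$Weight, sep = "_")
rowwise <- key1 %in% key2

# Keep only the rows of df1 whose (ID, Weight) pair also occurs in df2
df_rowwise <- df1[rowwise, ]

# Optional: the same row match as a filtering join, if dplyr is installed
if (requireNamespace("dplyr", quietly = TRUE)) {
  df_semi <- dplyr::semi_join(df1, df2, by = c("ID", "Weight"))
}
```

With this sample data, colwise is TRUE for both rows, while rowwise is FALSE for both, so df_rowwise has zero rows.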
    
