How to find dropped data after using Pandas merge in python?

误落风尘 2021-01-14 14:31

My DataFrame looks like the following. I am using the pandas merge function to merge two DataFrames, and I am trying to find the rows that were dropped. Is there a way to do this in pandas or Python?

2 answers
  • 2021-01-14 15:03
    merge = pd.merge(df1,df2,on='Name', indicator=True, how='outer')
    print (merge)
    # delete the original DataFrames once the merged result exists
    del df1
    del df2
    
  • 2021-01-14 15:17

    Use merge with an outer join and the parameter indicator=True. This adds a _merge column that marks each row as both, left_only, or right_only:

    df = pd.merge(df1,df2,on='Name', indicator=True, how='outer')
    print (df)
      Name   Age  Add      _merge
    0    A  34.0   rt        both
    1    B  23.0   ct        both
    2    C  90.0  NaN   left_only
    3    D   NaN   pt  right_only
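
    For readers who want to reproduce the output above, here is a minimal sketch. The question did not show the input data, so the contents of df1 and df2 below are an assumption reconstructed from the printed results.

```python
import pandas as pd

# Assumed inputs: reconstructed from the outputs printed in this answer,
# not taken from the original question.
df1 = pd.DataFrame({"Name": ["A", "B", "C"], "Age": [34, 23, 90]})
df2 = pd.DataFrame({"Name": ["A", "B", "D"], "Add": ["rt", "ct", "pt"]})

# how="outer" keeps every key from both frames; indicator=True adds
# the _merge column that records where each row came from.
df = pd.merge(df1, df2, on="Name", how="outer", indicator=True)
print(df)
```

    Age becomes a float column in the merged result because the row for D has no age and is filled with NaN.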
    

    Finally, filter out the rows marked both with boolean indexing:

    print (df[df['_merge'] != 'both'])
      Name   Age  Add      _merge
    2    C  90.0  NaN   left_only
    3    D   NaN   pt  right_only
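
    If the rows were lost in a default (inner) merge, the same indicator column can be used to split them by the frame they came from. The sketch below again assumes the sample data inferred from the outputs in this answer; the names dropped_left and dropped_right are illustrative only.

```python
import pandas as pd

# Assumed sample data, inferred from the printed outputs.
df1 = pd.DataFrame({"Name": ["A", "B", "C"], "Age": [34, 23, 90]})
df2 = pd.DataFrame({"Name": ["A", "B", "D"], "Add": ["rt", "ct", "pt"]})

outer = df1.merge(df2, on="Name", how="outer", indicator=True)

# Rows an inner merge would drop, restricted to the columns of the
# frame each row originally belonged to.
dropped_left = outer.loc[outer["_merge"] == "left_only", df1.columns]
dropped_right = outer.loc[outer["_merge"] == "right_only", df2.columns]

# The inner merge keeps only the keys present in both frames.
inner = df1.merge(df2, on="Name")

print(dropped_left)
print(dropped_right)
```

    Selecting df1.columns and df2.columns works here because the two frames share only the key column. If other column names overlapped, merge would add suffixes and the selection would need adjusting.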
    

    Another solution is to filter with isin and invert the mask with ~:

    print (df1[~df1['Name'].isin(df2['Name'])])
      Name  Age
    2    C   90
    
    print (df2[~df2['Name'].isin(df1['Name'])])
      Name Add
    2    D  pt
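
    The isin approach can be wrapped in a small function so it works in either direction. This is only a sketch: unmatched_rows is a hypothetical helper name, not a pandas function, and the sample frames are assumed from the outputs above.

```python
import pandas as pd


def unmatched_rows(left, right, key):
    """Return the rows of `left` whose `key` value does not occur in `right`.

    Hypothetical helper for illustration; not part of pandas.
    """
    return left[~left[key].isin(right[key])]


# Assumed sample data, inferred from the printed outputs.
df1 = pd.DataFrame({"Name": ["A", "B", "C"], "Age": [34, 23, 90]})
df2 = pd.DataFrame({"Name": ["A", "B", "D"], "Add": ["rt", "ct", "pt"]})

only_in_df1 = unmatched_rows(df1, df2, "Name")
only_in_df2 = unmatched_rows(df2, df1, "Name")
print(only_in_df1)
print(only_in_df2)
```

    Unlike the merge-based filter, this keeps the original index and dtypes of each frame, which is why Age is printed as 90 rather than 90.0.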
    