Compare Multiple Columns to Get Rows that are Different in Two Pandas Dataframes

悲哀的现实 2021-01-14 16:38

I have two dataframes:

df1=
    A    B   C
0   A0   B0  C0
1   A1   B1  C1
2   A2   B2  C2

df2=
    A    B   C
0   A2   B2  C10
1   A1   B3  C11
2   A9   B4         


        
4 Answers
  •  囚心锁ツ
    2021-01-14 17:22

    Method (1)


    In [63]:
    df1['A'].isin(df2['A']) & df1['B'].isin(df2['B'])
    Out[63]:
    
    0   False
    1   False
    2   True
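
A self-contained sketch of Method (1), for readers who want to run it. The blank C value in the last row of df2 is filled with None here, which is an assumption; the `extra` frame is hypothetical and only illustrates a caveat: two separate `isin` calls test each column independently, so a row can pass even when its (A, B) pair never occurs together in df2. Comparing the pairs through a `MultiIndex` is one possible way to make the check row-wise.

```python
import pandas as pd

df1 = pd.DataFrame({"A": ["A0", "A1", "A2"],
                    "B": ["B0", "B1", "B2"],
                    "C": ["C0", "C1", "C2"]})
# The last C value of df2 is blank in the question; None is an assumption.
df2 = pd.DataFrame({"A": ["A2", "A1", "A9"],
                    "B": ["B2", "B3", "B4"],
                    "C": ["C10", "C11", None]})

# Method (1): each column is tested on its own, then the masks are combined.
col_mask = df1["A"].isin(df2["A"]) & df1["B"].isin(df2["B"])

# Row-wise variant: test whether each (A, B) pair of df1 occurs in df2.
pairs1 = pd.MultiIndex.from_frame(df1[["A", "B"]])
pairs2 = pd.MultiIndex.from_frame(df2[["A", "B"]])
row_mask = pairs1.isin(pairs2)

# Hypothetical row: A1 and B2 both occur in df2, but never in the same row.
extra = pd.DataFrame({"A": ["A1"], "B": ["B2"]})
extra_col = extra["A"].isin(df2["A"]) & extra["B"].isin(df2["B"])
extra_row = pd.MultiIndex.from_frame(extra).isin(pairs2)

print(col_mask.tolist(), row_mask.tolist())
print(extra_col.tolist(), extra_row.tolist())
```

For the frames in the question both masks agree; they differ only for cases like the hypothetical `extra` row.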
    

    Method (2)


    You can use a left merge to obtain the rows that exist in both frames plus the rows that exist only in the first frame:

    In [10]:
    left = pd.merge(df1, df2, on=['A', 'B'], how='left')
    left
    Out[10]:
        A   B   C_x C_y
    0   A0  B0  C0  NaN
    1   A1  B1  C1  NaN
    2   A2  B2  C2  C10
    

    Rows that exist only in the first frame will have NaN in the columns that come from the other frame, so you can filter on those NaN values as follows:

    In [16]:
    left.loc[pd.isnull(left['C_y']), 'A':'C_x']
    Out[16]:
        A   B   C_x
    0   A0  B0  C0
    1   A1  B1  C1
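
Filtering on NaN in C_y assumes that C in df2 is never missing for a matching row, and df2 in the question does appear to contain a blank C. A possible alternative, sketched below, is to let `merge` record where each row came from through its `indicator=True` option and to filter on the resulting `_merge` column. The None used for the blank C value is an assumption.

```python
import pandas as pd

df1 = pd.DataFrame({"A": ["A0", "A1", "A2"],
                    "B": ["B0", "B1", "B2"],
                    "C": ["C0", "C1", "C2"]})
# The last C value of df2 is blank in the question; None is an assumption.
df2 = pd.DataFrame({"A": ["A2", "A1", "A9"],
                    "B": ["B2", "B3", "B4"],
                    "C": ["C10", "C11", None]})

# Merge against the key columns of df2 only, so C keeps its name and
# duplicate keys in df2 cannot multiply rows of df1.
keys = df2[["A", "B"]].drop_duplicates()
merged = df1.merge(keys, on=["A", "B"], how="left", indicator=True)

# "left_only" marks rows of df1 whose (A, B) pair has no match in df2.
only_in_df1 = merged.loc[merged["_merge"] == "left_only", ["A", "B", "C"]]
print(only_in_df1)
```

Because the filter looks at `_merge` rather than at a data column, it does not depend on whether C contains missing values.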
    

    If you want to check whether each row of df1 (matched on A and B) exists in df2, you can do the following:

    In [20]:
    pd.notnull(left['C_y'])
    Out[20]:
    0    False
    1    False
    2     True
    Name: C_y, dtype: bool
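
If the goal is the rows that differ in either direction, as the title suggests, one option is an outer merge with `indicator=True`, keeping everything that is not marked "both". This is a sketch under the same assumption as the question's data allows: the blank C value in df2 is represented as None.

```python
import pandas as pd

df1 = pd.DataFrame({"A": ["A0", "A1", "A2"],
                    "B": ["B0", "B1", "B2"],
                    "C": ["C0", "C1", "C2"]})
# The last C value of df2 is blank in the question; None is an assumption.
df2 = pd.DataFrame({"A": ["A2", "A1", "A9"],
                    "B": ["B2", "B3", "B4"],
                    "C": ["C10", "C11", None]})

# The outer merge keeps the (A, B) pairs from both frames; _merge records
# whether a pair was found in df1 only, in df2 only, or in both.
outer = pd.merge(df1, df2, on=["A", "B"], how="outer", indicator=True)

# Rows whose (A, B) pair is missing from one of the two frames.
diff = outer[outer["_merge"] != "both"]
print(diff)
```

Here C_x holds the value from df1 and C_y the value from df2, so the `_merge` column is what tells the two sides apart.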
    
