python pandas: return indexes of common rows

庸人自扰 2021-01-15 20:01

Apologies if this is a fairly newbie question. I am trying to find which rows are common to two DataFrames. The return values should be the row indexes of df2.

2 answers
  • 2021-01-15 20:37

    A new column with the joined values is not necessary. merge performs an inner join by default, here on both columns, and if you need the values of df2.index, add reset_index:

    import pandas as pd
    
    df1 = pd.DataFrame({'col1':['cx','cx','cx2'], 'col2':[1,4,12]})
    df2 = pd.DataFrame({'col1':['cx','cx','cx','cx','cx2','cx2'], 'col2':[1,3,5,10,12,12]})
    
    df3 = pd.merge(df1,df2.reset_index(), on = ['col1','col2'])
    print (df3)
      col1  col2  index
    0   cx     1      0
    1  cx2    12      4
    2  cx2    12      5
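
If only the row labels of df2 are wanted, as the question asks, the `index` column of the merged result can be pulled out as a list. A minimal sketch, assuming the same example frames as above; putting df2 on the left and dropping duplicate rows from df1 first are my own additions, so that the labels come back in df2's order and each label appears at most once:

```python
import pandas as pd

df1 = pd.DataFrame({'col1': ['cx', 'cx', 'cx2'], 'col2': [1, 4, 12]})
df2 = pd.DataFrame({'col1': ['cx', 'cx', 'cx', 'cx', 'cx2', 'cx2'],
                    'col2': [1, 3, 5, 10, 12, 12]})

# reset_index() turns df2's row labels into a regular column named 'index'
# so they survive the merge; drop_duplicates() on df1 prevents a repeated
# df1 row from producing the same df2 label twice.
merged = pd.merge(df2.reset_index(), df1.drop_duplicates(), on=['col1', 'col2'])

common_index = merged['index'].tolist()
print(common_index)  # [0, 4, 5]
```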
    

    To get the indexes of both DataFrames:

    df4 = pd.merge(df1.reset_index(),df2.reset_index(), on = ['col1','col2'])
    print (df4)
    
       index_x col1  col2  index_y
    0        0   cx     1        0
    1        2  cx2    12        4
    2        2  cx2    12        5
    

    For only the intersection of both DataFrames:

    df5 = pd.merge(df1,df2, on = ['col1','col2'])
    # if both DataFrames contain only these 2 columns, `on` can be omitted:
    # df5 = pd.merge(df1, df2)
    print (df5)
    
      col1  col2
    0   cx     1
    1  cx2    12
    2  cx2    12
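
For comparison, the same df2 row labels can probably be obtained without a merge, by testing which (col1, col2) pairs of df2 also occur in df1. This is a sketch of an alternative technique (`MultiIndex.isin`), not part of the answer above, and it assumes the same example frames:

```python
import pandas as pd

df1 = pd.DataFrame({'col1': ['cx', 'cx', 'cx2'], 'col2': [1, 4, 12]})
df2 = pd.DataFrame({'col1': ['cx', 'cx', 'cx', 'cx', 'cx2', 'cx2'],
                    'col2': [1, 3, 5, 10, 12, 12]})

keys = ['col1', 'col2']

# Build a MultiIndex of (col1, col2) pairs for each frame, then mark the
# df2 rows whose pair also appears somewhere in df1.
mask = df2.set_index(keys).index.isin(df1.set_index(keys).index)

# Boolean-index df2's own index to keep its original row labels.
common_index = df2.index[mask]
print(list(common_index))  # [0, 4, 5]
```

Unlike the merge, this keeps df2's original index in place, so `df2[mask]` also returns the matching rows directly.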
    
  • 2021-01-15 20:52

    This can easily be done by merging (inner join) both dataframes:

    common_rows = pd.merge(df1, df2.reset_index(), how='inner', on=['idx_values'])
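
The line above joins on a single column named `idx_values`, which is not defined in the example frames shown in this thread. A sketch of how it could be run end to end, assuming `idx_values` is a key column built by concatenating col1 and col2 (that construction is hypothetical, as is the choice to pass only the key column of df1 to avoid `_x`/`_y` suffixes):

```python
import pandas as pd

df1 = pd.DataFrame({'col1': ['cx', 'cx', 'cx2'], 'col2': [1, 4, 12]})
df2 = pd.DataFrame({'col1': ['cx', 'cx', 'cx', 'cx', 'cx2', 'cx2'],
                    'col2': [1, 3, 5, 10, 12, 12]})

# Hypothetical key column: 'cx' and 1 become 'cx_1', and so on.
for df in (df1, df2):
    df['idx_values'] = df['col1'] + '_' + df['col2'].astype(str)

# Inner join on the key column; reset_index() keeps df2's row labels
# in a column named 'index'.
common_rows = pd.merge(df1[['idx_values']], df2.reset_index(),
                       how='inner', on=['idx_values'])

print(common_rows['index'].tolist())  # [0, 4, 5]
```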
    