Compare two pandas DataFrames

Anonymous (unverified), submitted 2019-12-03 08:54:24

Question:

I have two pandas dataframes defined as such:

import pandas as pd

_data_orig = [
    [1, "Bob", 3.0],
    [2, "Sam", 2.0],
    [3, "Jane", 4.0],
]
_columns = ["ID", "Name", "GPA"]

_data_new = [
    [1, "Bob", 3.2],
    [3, "Jane", 3.9],
    [4, "John", 1.2],
    [5, "Lisa", 2.2],
]

df1 = pd.DataFrame(data=_data_orig, columns=_columns)
df2 = pd.DataFrame(data=_data_new, columns=_columns)

I need to find the following information:

  • Find deletes, where df1 is the original data set and df2 is the new data set
  • Find the row changes for existing records between the two. Example: the row with ID == 1 in df1 should be compared to the row with ID == 1 in df2 to see if any column value changed.
  • Find any adds to df2 versus df1. Example: return [4, "John", 1.2] and [5, "Lisa", 2.2]

For the operation to find changes in rows, I figured I could look through df2 and check df1, but that seems slow, so I'm hoping to find a faster solution there.

For the other two operations, I really do not know what to do because when I try to compare the two dataframes I get:

ValueError: Can only compare identically-labeled DataFrame objects

Pandas version: '0.16.1'

Suggestions?

Answer 1:

setup

m = df1.merge(df2, on=['ID', 'Name'], how='outer', suffixes=['', '_'], indicator=True)
m

adds

m.loc[m._merge.eq('right_only')]
or
m.query('_merge == "right_only"')

deletes

m.loc[m._merge.eq('left_only')]
or
m.query('_merge == "left_only"')
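changes (not covered above)

The adds and deletes above leave out the second requirement in the question, rows that exist in both dataframes but whose values changed. A minimal sketch, assuming GPA is the only non-key column and reusing the same merge: keep the rows marked "both" and compare the two GPA columns. The variable name `changes` is mine, not from the original answer.

```python
import pandas as pd

df1 = pd.DataFrame([[1, "Bob", 3.0], [2, "Sam", 2.0], [3, "Jane", 4.0]],
                   columns=["ID", "Name", "GPA"])
df2 = pd.DataFrame([[1, "Bob", 3.2], [3, "Jane", 3.9],
                    [4, "John", 1.2], [5, "Lisa", 2.2]],
                   columns=["ID", "Name", "GPA"])

# Same merge as in the setup step; GPA comes from df1, GPA_ from df2.
m = df1.merge(df2, on=["ID", "Name"], how="outer",
              suffixes=("", "_"), indicator=True)

# Rows present on both sides whose GPA differs between df1 and df2.
changes = m.loc[m["_merge"].eq("both") & m["GPA"].ne(m["GPA_"])]
print(changes[["ID", "Name", "GPA", "GPA_"]])
```

Because the merge key is ['ID', 'Name'], a record whose Name changed would show up as one delete plus one add rather than as a change. With more value columns, each suffixed pair would need its own comparison.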


Answer for pandas 0.16.1 (which predates the indicator argument, added in 0.17.0)

setup

m = df1.merge(df2, on=['ID', 'Name'], how='outer', suffixes=['', '_'])
m

adds

m.loc[m.GPA_.notnull() & m.GPA.isnull()] 

deletes

m.loc[m.GPA_.isnull() & m.GPA.notnull()] 
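changes (not covered above)

The same null-check style can probably be stretched to changed rows without the indicator column. This is only a sketch, and it assumes GPA is never missing in the source data, so that a null after the merge always means "key absent on that side". The names `both` and `changes` are mine.

```python
import pandas as pd

df1 = pd.DataFrame([[1, "Bob", 3.0], [2, "Sam", 2.0], [3, "Jane", 4.0]],
                   columns=["ID", "Name", "GPA"])
df2 = pd.DataFrame([[1, "Bob", 3.2], [3, "Jane", 3.9],
                    [4, "John", 1.2], [5, "Lisa", 2.2]],
                   columns=["ID", "Name", "GPA"])

# Outer merge without indicator; GPA comes from df1, GPA_ from df2.
m = df1.merge(df2, on=["ID", "Name"], how="outer", suffixes=("", "_"))

# A key is on both sides when neither GPA column is null
# (assumption: the source data itself contains no missing GPA).
both = m["GPA"].notnull() & m["GPA_"].notnull()

# Of those, keep the rows where the value actually differs.
changes = m.loc[both & (m["GPA"] != m["GPA_"])]
print(changes)
```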



Answer 2:

Doing this:

df1.set_index(['Name','ID'])-df2.set_index(['Name','ID'])
Out[108]:
             GPA
Name ID
Bob  1  -0.2000
Jane 3   0.1000
John 4      nan
Lisa 5      nan
Sam  2      nan

lets you see where df1 and df2 differ. NaN marks rows whose (Name, ID) key appears in only one of the two dataframes.
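The subtraction alone cannot tell an add from a delete, since both come out as NaN. One way to split the three cases, sketched here under the assumption that GPA is the only value column and is numeric, is to combine the difference with index membership checks. The names `a`, `b`, `changed`, `adds` and `deletes` are mine.

```python
import pandas as pd

df1 = pd.DataFrame([[1, "Bob", 3.0], [2, "Sam", 2.0], [3, "Jane", 4.0]],
                   columns=["ID", "Name", "GPA"])
df2 = pd.DataFrame([[1, "Bob", 3.2], [3, "Jane", 3.9],
                    [4, "John", 1.2], [5, "Lisa", 2.2]],
                   columns=["ID", "Name", "GPA"])

a = df1.set_index(["Name", "ID"])
b = df2.set_index(["Name", "ID"])

# Arithmetic aligns on the union of both indexes;
# keys found on only one side come out as NaN.
diff = a - b

# Keys on both sides with a nonzero difference are changed rows.
changed = diff.loc[diff["GPA"].notnull() & diff["GPA"].ne(0)]

# Index membership separates adds (only in df2) from deletes (only in df1).
adds = b.loc[~b.index.isin(a.index)]
deletes = a.loc[~a.index.isin(b.index)]

print(changed)
print(adds)
print(deletes)
```

The exact comparison against 0 is fine for spotting any change at all; if tiny floating-point noise should be ignored, a tolerance would be the safer choice.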


