I tried all the solutions here: Pandas "Can only compare identically-labeled DataFrame objects" error
Didn't work for me. Here's what I've got. I hav
I replicated this with some fake data; the end goal was removing duplicates. Note that this does not answer the original question. It solves the task I was attempting when I ran into the error.
import numpy as np
import pandas as pd

# Two frames that overlap on the row labelled 7
b = pd.DataFrame({'A': ['A4', 'A5', 'A6', 'A7'],
                  'B': ['B4', 'B5', 'B6', 'B7'],
                  'C': ['C4', 'C5', 'C6', 'C7'],
                  'D': ['D4', 'D5', 'D6', 'D7']},
                 index=[4, 5, 6, 7])
c = pd.DataFrame({'A': ['A7', 'A8', 'A9', 'A10', 'A11'],
                  'B': ['B7', 'B8', 'B9', 'B10', 'B11'],
                  'C': ['C7', 'C8', 'C9', 'C10', 'C11'],
                  'D': ['D7', 'D8', 'D9', 'D10', 'D11']},
                 index=[7, 8, 9, 10, 11])
result = pd.concat([b, c])
# Positions of the first occurrence of each unique value in column A
idx = np.unique(result["A"], return_index=True)[1]
# DataFrame.sort() was removed from pandas; sort_index() orders the rows by index label
result.iloc[idx].sort_index()
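For comparison, the same de-duplication can probably be done without NumPy. The following is only a sketch using `DataFrame.drop_duplicates`, which is a different technique from the `np.unique` approach above; the data is a shortened copy of the fake frames and the name `deduped` is made up for illustration.

```python
import pandas as pd

# Shortened version of the fake data above (two columns, fewer rows).
b = pd.DataFrame({"A": ["A4", "A5", "A6", "A7"],
                  "B": ["B4", "B5", "B6", "B7"]},
                 index=[4, 5, 6, 7])
c = pd.DataFrame({"A": ["A7", "A8", "A9"],
                  "B": ["B7", "B8", "B9"]},
                 index=[7, 8, 9])

# Keep only the first row for each distinct value in column "A".
# Rows stay in concatenation order, so no re-sorting is needed here.
deduped = pd.concat([b, c]).drop_duplicates(subset="A", keep="first")
print(deduped)
```

Because `keep="first"` is used, the row labelled 7 from `b` survives and the one from `c` is dropped.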
If you want to compare two DataFrames, check out the flexible comparison methods in pandas, such as .eq(), .ne(), .gt() and more (equal, not equal and greater than).
Example:
df['new_col'] = df['A'].gt(df_1['A'])  # 'A' is a placeholder column name
http://pandas.pydata.org/pandas-docs/stable/basics.html#flexible-comparisons
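To illustrate why the flexible methods help with this error, here is a small sketch. The frames `df` and `df_1` and the column name `A` are invented for the example: they hold the same labels in a different order, which is enough to make `==` fail, while `.eq()` aligns on the labels first.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2]}, index=[0, 1])
df_1 = pd.DataFrame({"A": [2, 1]}, index=[1, 0])  # same labels, different order

# The == operator refuses to compare frames whose labels are not identical.
try:
    df == df_1
    raised = False
except ValueError:
    raised = True

# .eq() aligns both frames on index and columns before comparing.
aligned = df.eq(df_1)
print(raised)
print(aligned)
```

Note that alignment compares values label by label, so labels present in only one of the frames show up as False rather than raising.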
To get around this, compare the underlying NumPy arrays, which ignores the index and column labels.
import pandas as pd
df1 = pd.DataFrame([[1, 2], [3, 4]], columns=['A', 'B'], index=['One', 'Two'])
df2 = pd.DataFrame([[1, 2], [3, 4]], columns=['a', 'b'], index=['one', 'two'])
df1.values == df2.values
array([[ True, True],
[ True, True]], dtype=bool)
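If a single True/False answer is wanted instead of an element-wise array, something like the sketch below might do. It assumes that only the values matter and the labels should be ignored; `df3` is a made-up frame with a different shape, added to show that `np.array_equal` returns False there instead of attempting an element-wise comparison.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame([[1, 2], [3, 4]], columns=['A', 'B'], index=['One', 'Two'])
df2 = pd.DataFrame([[1, 2], [3, 4]], columns=['a', 'b'], index=['one', 'two'])
df3 = pd.DataFrame([[1, 2], [3, 4], [5, 6]], columns=['A', 'B'])  # extra row

# np.array_equal checks shape and values, ignoring the pandas labels.
same_values = np.array_equal(df1.to_numpy(), df2.to_numpy())
different_shape = np.array_equal(df1.to_numpy(), df3.to_numpy())
print(same_values, different_shape)
```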