Equality in Pandas DataFrames - Column Order Matters?

说谎 2020-12-13 19:24

As part of a unit test, I need to test two DataFrames for equality. The order of the columns in the DataFrames is not important to me. However, it seems to matter to pandas.
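
A minimal reproduction of that behaviour might look like the sketch below. The frames `df1` and `df2` are made-up examples, not taken from the original post; they hold the same data with the columns in a different order.

```python
import pandas as pd
from pandas.testing import assert_frame_equal

# Two frames with identical data; only the column order differs.
df1 = pd.DataFrame({"a": [1, 2], "b": [3, 4]})
df2 = df1[["b", "a"]]

# DataFrame.equals requires the column labels to line up in order,
# so the reordered frame is reported as different.
same_as_is = df1.equals(df2)

# assert_frame_equal also rejects the reordered frame by default.
try:
    assert_frame_equal(df1, df2)
    assertion_raised = False
except AssertionError:
    assertion_raised = True

print(same_as_is, assertion_raised)
```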

8 Answers
  • 2020-12-13 20:16

    Sorting the columns only works if the row and column labels match across the frames. If two DataFrames hold identical cell values but carry different labels, the sort-based solution does not work. I ran into this scenario when implementing k-modes clustering with pandas.

    I got around it with a simple equality function that compares the cells by position (code below):

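    from pandas import DataFrame

    def frames_equal(df1, df2):
        """Return True if both frames hold the same values, cell by cell."""
        if not isinstance(df1, DataFrame) or not isinstance(df2, DataFrame):
            raise TypeError(
                "both arguments must be instances of pandas.DataFrame")

        # Frames of different shapes can never be equal.
        if df1.shape != df2.shape:
            return False

        num_rows, _ = df1.shape
        for i in range(num_rows):
            # Compare the raw values so that differing labels are ignored;
            # recent pandas raises when comparing Series whose labels differ.
            if not (df1.iloc[i].values == df2.iloc[i].values).all():
                return False
        return True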
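
The same positional check can also be sketched without an explicit row loop. This is an alternative rather than the code above: it assumes NumPy is available, and `values_equal` is a hypothetical helper name introduced only for illustration. Like the loop, it treats NaN cells as unequal.

```python
import numpy as np
import pandas as pd

def values_equal(df1: pd.DataFrame, df2: pd.DataFrame) -> bool:
    """Compare two frames by cell position only, ignoring all labels."""
    if df1.shape != df2.shape:
        return False
    # to_numpy() discards the row and column labels.
    return bool(np.array_equal(df1.to_numpy(), df2.to_numpy()))

# Same values, but different row and column labels.
left = pd.DataFrame({"a": [1, 2], "b": [3, 4]})
right = pd.DataFrame({"x": [1, 2], "y": [3, 4]}, index=["r1", "r2"])
# One cell differs from `left`.
other = pd.DataFrame({"x": [1, 2], "y": [3, 5]}, index=["r1", "r2"])

print(values_equal(left, right))
print(values_equal(left, other))
```

Sorting the labels would not help for `left` and `right`, because no ordering makes `a`, `b` match `x`, `y`; only a comparison by position does.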
    
  • 2020-12-13 20:17

    The most common intent is handled like this:

    # pandas.testing is the public location; pandas.util.testing is deprecated.
    from pandas.testing import assert_frame_equal

    def assertFrameEqual(df1, df2, **kwds):
        """Assert that two dataframes are equal, ignoring ordering of columns."""
        # Sort the column labels on both sides so their order cannot differ.
        return assert_frame_equal(df1.sort_index(axis=1), df2.sort_index(axis=1), check_names=True, **kwds)

    See pandas.testing.assert_frame_equal for the other parameters you can pass.
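
As a quick check of this approach, the sketch below applies it to two made-up frames (`expected` and `actual` are illustrative names, not from the answer). It also shows `check_like=True`, a parameter of `assert_frame_equal` that ignores the order of index and columns and may be enough on its own in many tests.

```python
import pandas as pd
from pandas.testing import assert_frame_equal

expected = pd.DataFrame({"a": [1, 2], "b": [3.0, 4.0]})
actual = expected[["b", "a"]]  # same data, columns reordered

# Sorting the column labels on both sides makes the order irrelevant.
assert_frame_equal(expected.sort_index(axis=1), actual.sort_index(axis=1))

# check_like=True asks assert_frame_equal itself to ignore label order.
assert_frame_equal(expected, actual, check_like=True)

# A genuine difference in the values is still reported.
changed = actual.assign(a=[1, 99])
try:
    assert_frame_equal(expected, changed, check_like=True)
    difference_detected = False
except AssertionError:
    difference_detected = True
```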
