Pandas drop duplicates; values in reverse order

暖寄归人 2021-01-15 17:29

I'm trying to find a way to make pandas drop_duplicates() treat rows as duplicates when they contain the same values in reverse order.

An example is

1 Answer
  • 2021-01-15 17:55

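    First sort the values within each row with apply and sorted, then call drop_duplicates:

    # result_type='broadcast' keeps the original shape, index and columns;
    # on recent pandas versions a plain apply(sorted, axis=1) returns a Series of lists
    df = df.apply(sorted, axis=1, result_type='broadcast').drop_duplicates()
    print(df)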
       Item1   Item2
    0  Apple  Banana
    

    # if only specific columns should be compared
    cols = ['Item1','Item2']
    df[cols] = df[cols].apply(sorted, axis=1, result_type='broadcast')
    df = df.drop_duplicates(subset=cols)
    print(df)
       Item1   Item2
    0  Apple  Banana
    

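    Another solution uses numpy.sort and the DataFrame constructor:

    import numpy as np
    import pandas as pd

    # sort the values of each row, rebuild the frame, then drop the repeated rows
    df = pd.DataFrame(np.sort(df.values, axis=1), index=df.index, columns=df.columns).drop_duplicates()
    print(df)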
       Item1   Item2
    0  Apple  Banana
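
For anyone who wants to try this end to end, here is a minimal self-contained sketch of the numpy.sort idea. The sample frame is an assumption (the question's original example did not survive), and `drop_unordered_duplicates` is a hypothetical helper name, not a pandas function. It is a small variation on the answer above: the sorted values are used only as a comparison key, so the surviving rows keep their values as originally written.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the question's original example was not preserved.
df = pd.DataFrame({
    "Item1": ["Apple", "Banana", "Cherry"],
    "Item2": ["Banana", "Apple", "Date"],
})


def drop_unordered_duplicates(frame, cols):
    """Drop rows whose values in `cols` match an earlier row in any order."""
    # Sort the values within each row so that (Apple, Banana) and
    # (Banana, Apple) produce the same comparison key.
    keys = pd.DataFrame(np.sort(frame[cols].to_numpy(), axis=1), index=frame.index)
    # duplicated() flags every repeat after the first occurrence of a key.
    return frame[~keys.duplicated()]


result = drop_unordered_duplicates(df, ["Item1", "Item2"])
print(result)
```

With this sample frame, rows 0 and 2 remain and row 1 (Banana, Apple) is dropped as a reversed copy of row 0. Note that np.sort needs the compared columns to hold mutually comparable values, for example all strings.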
    