Grouping by multiple columns to find duplicate rows in pandas

南旧 2021-02-05 17:47

I have a DataFrame df:

id    val1    val2
 1     1.1      2.2
 1     1.1      2.2
 2     2.1      5.5
 3     8.8      6.2
 4     1.1      2.2
 5     8.8      6.2


        
1 Answer
  • 2021-02-05 18:23

    Use DataFrame.duplicated with the subset parameter to specify which columns to check, and keep=False to mark every occurrence of a duplicate (not just the repeats). This gives a boolean mask, which you then use to filter with boolean indexing:

    df = df[df.duplicated(subset=['val1','val2'], keep=False)]
    print (df)
       id  val1  val2
    0   1   1.1   2.2
    1   1   1.1   2.2
    3   3   8.8   6.2
    4   4   1.1   2.2
    5   5   8.8   6.2
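The effect of keep is easier to see side by side. The following is a minimal sketch, not part of the original answer: it rebuilds the question's data (assuming val2 for id 5 is 6.2, as in the output above) and compares the three keep settings. The variable names are only illustrative.

```python
import pandas as pd

# Sample data rebuilt from the question; val2 for id 5 is assumed to be 6.2,
# matching the filtered output shown in the answer.
df = pd.DataFrame({
    "id": [1, 1, 2, 3, 4, 5],
    "val1": [1.1, 1.1, 2.1, 8.8, 1.1, 8.8],
    "val2": [2.2, 2.2, 5.5, 6.2, 2.2, 6.2],
})
cols = ["val1", "val2"]

# keep='first' (the default): the first occurrence is NOT flagged, later repeats are.
mask_first = df.duplicated(subset=cols, keep="first")

# keep='last': the last occurrence is NOT flagged, earlier repeats are.
mask_last = df.duplicated(subset=cols, keep="last")

# keep=False: every row that has at least one twin is flagged.
mask_all = df.duplicated(subset=cols, keep=False)

print(pd.DataFrame({"first": mask_first, "last": mask_last, "all": mask_all}))
```

Only keep=False returns all members of each duplicate group, which is what the question asks for.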
    

    Detail (the mask itself, computed on the original df before filtering):

    print (df.duplicated(subset=['val1','val2'], keep=False))
    0     True
    1     True
    2    False
    3     True
    4     True
    5     True
    dtype: bool
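Since the title asks about grouping, here is a hedged sketch of a groupby-based equivalent. It is not from the original answer. It assumes the same sample data (with val2 for id 5 taken to be 6.2) and builds the mask by broadcasting each group's size back to its rows.

```python
import pandas as pd

# Sample data rebuilt from the question; val2 for id 5 is assumed to be 6.2.
df = pd.DataFrame({
    "id": [1, 1, 2, 3, 4, 5],
    "val1": [1.1, 1.1, 2.1, 8.8, 1.1, 8.8],
    "val2": [2.2, 2.2, 5.5, 6.2, 2.2, 6.2],
})
cols = ["val1", "val2"]

# Mask from the answer: True where the (val1, val2) pair occurs more than once.
mask = df.duplicated(subset=cols, keep=False)

# Groupby alternative: compute the size of each (val1, val2) group and
# broadcast it back onto the rows, then flag groups with more than one row.
group_size = df.groupby(cols)["id"].transform("size")
mask_groupby = group_size > 1

# Filter into a new variable so the original df stays available.
duplicates = df[mask]
print(duplicates)
```

On this data both masks agree. duplicated is usually the more direct choice. The groupby form is handy when the group size itself is needed, and the two may differ if the key columns contain NaN, since groupby drops NaN keys by default.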
    