Get Rows based on distinct values from Column 2

不思量自难忘° 2020-11-22 10:35

I am new to pandas and have searched Google for this without luck. How can I get the rows with distinct values in column 2?

For example, I have the dataframe be

1 Answer
  • 2020-11-22 11:34

    Use drop_duplicates, passing COL2 as the column to check for duplicates:

    df = df.drop_duplicates('COL2')
    # same as:
    # df = df.drop_duplicates('COL2', keep='first')
    print(df)
        COL1  COL2
    0  a.com    22
    1  b.com    45
    2  c.com    34
    4  f.com    56
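
The question's DataFrame is not shown, so here is a minimal sketch with hypothetical sample data that reproduces the output above. Row 3 (`d.com`, 45) is an assumption: any row whose COL2 repeats an earlier value and is itself repeated later would give the same results. The variable name `first` is also only illustrative.

```python
import pandas as pd

# Hypothetical sample data: the original frame is not shown in the question.
# Row 3 ("d.com", 45) is an assumption chosen to be consistent with the outputs.
df = pd.DataFrame({
    "COL1": ["a.com", "b.com", "c.com", "d.com", "f.com", "g.com", "h.com"],
    "COL2": [22, 45, 34, 45, 56, 22, 45],
})

# Keep the first row seen for each distinct COL2 value (keep="first" is the default).
first = df.drop_duplicates(subset="COL2")
print(first)
```

Assigning the result to a new name instead of back to `df` leaves the original frame available for trying the other `keep` options.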
    

    You can also keep only the last occurrence of each value (starting again from the original df):

    df = df.drop_duplicates('COL2', keep='last')
    print(df)
        COL1  COL2
    2  c.com    34
    4  f.com    56
    5  g.com    22
    6  h.com    45
    

    Or remove every row whose COL2 value occurs more than once (again starting from the original df):

    df = df.drop_duplicates('COL2', keep=False)
    print(df)
        COL1  COL2
    2  c.com    34
    4  f.com    56
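
To see which rows each `keep` option treats as duplicates, `DataFrame.duplicated` returns the corresponding boolean mask. The sketch below uses hypothetical sample data (the question's frame is not shown; row 3 is an assumption consistent with the outputs above) and illustrative variable names.

```python
import pandas as pd

# Hypothetical sample data; row 3 ("d.com", 45) is an assumption.
df = pd.DataFrame({
    "COL1": ["a.com", "b.com", "c.com", "d.com", "f.com", "g.com", "h.com"],
    "COL2": [22, 45, 34, 45, 56, 22, 45],
})

# keep=False marks every row whose COL2 value occurs more than once.
dup_any = df.duplicated(subset="COL2", keep=False)

# Filtering with the inverted mask selects the same rows as drop_duplicates.
unique_only = df[~dup_any]
by_method = df.drop_duplicates(subset="COL2", keep=False)
print(unique_only.equals(by_method))

# keep="last" retains the final row for each COL2 value, in original order.
last = df.drop_duplicates(subset="COL2", keep="last")
print(last)
```

The mask is handy when you want to inspect or count the duplicated rows before deciding which ones to drop.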
    