Get Rows based on distinct values from Column 2

I am new to pandas and have searched Google for this without any luck. How can I select rows based on the distinct values in column 2?

For example, I have the dataframe be

1 Answer

萌比男神i

2020-11-22 11:34

Use drop_duplicates and specify column COL2 as the subset to check for duplicates:

# keep the first row for each distinct COL2 value; df itself is not modified
df_first = df.drop_duplicates('COL2')
# same as
# df_first = df.drop_duplicates('COL2', keep='first')
print(df_first)
    COL1  COL2
0  a.com    22
1  b.com    45
2  c.com    34
4  f.com    56
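
To make the snippet above runnable on its own, here is a minimal sketch that builds a sample frame first. Rows 0-2 and 4-6 are taken from the outputs shown in this answer; row 3 ('e.com', 45) is an assumption, since the question's original data is not shown here. Any row 3 that repeats a COL2 value of 22 or 45 would give the same results.

```python
import pandas as pd

# Sample data reconstructed from the printed outputs in this answer.
# Row 3 ("e.com", 45) is an ASSUMPTION: the original input frame is not shown.
df = pd.DataFrame({
    "COL1": ["a.com", "b.com", "c.com", "e.com", "f.com", "g.com", "h.com"],
    "COL2": [22, 45, 34, 45, 56, 22, 45],
})

# drop_duplicates returns a new DataFrame and leaves df unchanged.
# keep='first' is the default, so the first row per COL2 value survives.
first = df.drop_duplicates("COL2")
print(first)

# The original index labels are preserved (0, 1, 2, 4).
# Call first.reset_index(drop=True) if a fresh 0..n-1 index is wanted.
print(first.index.tolist())
```

Because the result is a new object, the original df still holds all 7 rows and can be reused for the other keep options.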

You can also keep only the last row for each value:

# keep the last row for each distinct COL2 value
df_last = df.drop_duplicates('COL2', keep='last')
print(df_last)
    COL1  COL2
2  c.com    34
4  f.com    56
5  g.com    22
6  h.com    45

Or remove every row whose COL2 value occurs more than once:

# keep=False drops all rows that share a COL2 value with another row
df_unique = df.drop_duplicates('COL2', keep=False)
print(df_unique)
    COL1  COL2
2  c.com    34
4  f.com    56
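
The three keep options can also be compared side by side with a boolean mask from DataFrame.duplicated, which marks the rows that drop_duplicates would remove. The sketch below is illustrative only: distinct_rows is a hypothetical helper name, and row 3 of the sample frame ('e.com', 45) is an assumption, since the question's original data is not shown here.

```python
import pandas as pd

# Sample data reconstructed from the outputs above.
# Row 3 ("e.com", 45) is an ASSUMPTION, not taken from the question.
df = pd.DataFrame({
    "COL1": ["a.com", "b.com", "c.com", "e.com", "f.com", "g.com", "h.com"],
    "COL2": [22, 45, 34, 45, 56, 22, 45],
})


def distinct_rows(frame, column, keep="first"):
    """Hypothetical helper: filter out the rows that duplicated() flags."""
    # duplicated() returns True for each row that counts as a duplicate
    # under the chosen keep policy ('first', 'last', or False).
    mask = frame.duplicated(subset=column, keep=keep)
    return frame[~mask]


results = {}
for keep in ("first", "last", False):
    results[keep] = distinct_rows(df, "COL2", keep)
    print(keep, results[keep].index.tolist())
```

With this sample frame the loop prints the index labels [0, 1, 2, 4], [2, 4, 5, 6] and [2, 4], matching the three outputs above. The mask form is handy when the duplicate rows themselves need inspecting, for example with df[df.duplicated('COL2', keep=False)].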
