I want to drop duplicates and keep the first value. The duplicates I want to drop are the rows where A = 'df'. Here's my data:
A B C D E
qw 1 3 1 1
er
Create a helper Series to distinguish runs of consecutive values in column A, then filter by boolean indexing with an inverted (~) boolean mask, built by chaining duplicated with another mask that compares the values to 'df':
s = df['A'].ne(df['A'].shift()).cumsum()
df = df[~((df['A'] == 'df') & (s.duplicated()))]
print(df)
A B C D E
0 qw 1 3 1 1
1 er 2 4 2 6
2 ew 4 8 44 4
3 df 34 34 34 34
7 we 2 5 5 2
8 we 4 4 4 4
9 df 34 9 34 34
11 we 4 7 4 4
12 qw 2 2 7 2
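To see what the helper Series actually contains, here is a minimal self-contained sketch. The sample frame is hypothetical (the original data is not shown in full), and the names mask and result are only for illustration. Each run of equal consecutive values in A gets its own group number, so duplicated() flags every row of a run except the first:

```python
import pandas as pd

# Hypothetical sample data, not the asker's full table
df = pd.DataFrame({
    'A': ['qw', 'df', 'df', 'df', 'we', 'we', 'df', 'df', 'qw'],
    'B': [1, 34, 3, 3, 2, 4, 34, 3, 2],
})

# A new group starts wherever A differs from the previous row;
# the cumulative sum turns those change points into run ids
s = df['A'].ne(df['A'].shift()).cumsum()

# True only for rows that repeat a run id AND have A == 'df'
mask = (df['A'] == 'df') & s.duplicated()

# Keep everything else: the first 'df' of each run survives,
# and consecutive 'we' rows are left untouched
result = df[~mask]
print(s.tolist())
print(result)
```

Because the second condition restricts the mask to 'df', repeated runs of other values (such as 'we' here) are kept as they are.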