Pandas: remove SOME duplicate values based on conditions
Question

I have a dataset:

    id  url    keep_if_dup
    1   A.com  Yes
    2   A.com  Yes
    3   B.com  No
    4   B.com  No
    5   C.com  No

I want to remove duplicates, i.e. keep the first occurrence of the "url" field, BUT keep duplicates if the "keep_if_dup" field is "Yes".

Expected output:

    id  url    keep_if_dup
    1   A.com  Yes
    2   A.com  Yes
    3   B.com  No
    5   C.com  No

What I tried:

    Dataframe = Dataframe.drop_duplicates(subset='url', keep='first')

which of course does not take the "keep_if_dup" field into account. The output is:

    id  url    keep_if_dup
    1   A.com  Yes
    3   B.com
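One possible approach, sketched here rather than taken from the original post: build a boolean mask with `DataFrame.duplicated` and combine it with the "keep_if_dup" flag, instead of calling `drop_duplicates` directly. The names `df`, `is_repeat`, `keep_anyway`, and `result` are illustrative, and the sketch assumes the flag is stored as the exact string "Yes".

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "id": [1, 2, 3, 4, 5],
    "url": ["A.com", "A.com", "B.com", "B.com", "C.com"],
    "keep_if_dup": ["Yes", "Yes", "No", "No", "No"],
})

# True for every row whose url has already appeared earlier in the frame.
is_repeat = df.duplicated(subset="url", keep="first")

# True for rows that should survive even when they are repeats.
# Assumption: the flag is exactly the string "Yes".
keep_anyway = df["keep_if_dup"].eq("Yes")

# Keep a row if it is the first occurrence of its url OR it is flagged.
result = df[~is_repeat | keep_anyway]

print(result)
```

On the sample data this keeps ids 1, 2, 3 and 5. If the real column mixes cases such as "YES" and "yes", normalising it first (for example with `.str.lower()`) would probably be needed before the comparison.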