How to drop rows that are not exact duplicates but contain no new information (more NaN)

Question

My goal is to collapse the table below into a single column. For this question specifically, I am asking how I can intelligently delete the yellow row, because it duplicates the gray row but carries less information.

The table has three categorical variables and six analysis/quantitative variables. Columns C1 and C2 are the only variables that need to match for a successful join. All blank cells are NaN, and Python code for reproducing the table is below.

Question 1 (Yellow). All of the quantitative information stored in the yellow row is also stored in the gray row, and the gray row has more information. Is there a way to intelligently delete a row of this type, similar to the pandas drop_duplicates function? A hypothetical option would be df.drop_duplicates(subset=df.columns[4:], ignoreNaNs=True).
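To make the requested behaviour concrete, here is a minimal sketch of what an "ignore NaNs" duplicate drop could look like. It is not a pandas feature: drop_subsumed_rows is a hypothetical helper name, it assumes a unique index and numeric quantitative columns, and it compares rows pairwise within each key group, so it is quadratic in group size. A row is treated as redundant when another row with the same keys agrees with it on every non-NaN value and holds at least as many values.

```python
import numpy as np
import pandas as pd


def drop_subsumed_rows(df, key_cols, value_cols):
    """Hypothetical helper (not a pandas API): drop rows that add no information.

    Assumes a unique index and numeric value columns.
    """
    keep = pd.Series(True, index=df.index)
    # dropna=False keeps groups whose key is NaN (needs pandas >= 1.1).
    for _, group in df.groupby(key_cols, sort=False, dropna=False):
        vals = group[value_cols].to_numpy(dtype=float)
        present = ~np.isnan(vals)
        counts = present.sum(axis=1)
        for i in range(len(group)):
            for j in range(len(group)):
                if i == j:
                    continue
                # Row j covers row i if, wherever i has a value, j has the same one.
                covers = np.all(~present[i] | (present[j] & (vals[i] == vals[j])))
                # Exact ties are broken by position so that one copy survives.
                if covers and (counts[j] > counts[i] or j < i):
                    keep.loc[group.index[i]] = False
                    break
    return df[keep]


nan = np.nan
rows = [
    ["S1", "P3", "H1", pd.Timestamp("2004-12-04"), -15.0, -27.4, nan, -10.0, -15.0, nan],
    ["S1", "P3", "H1", pd.Timestamp("2004-12-20"), nan, nan, nan, nan, nan, nan],
    ["S1", "P3", "H2", pd.Timestamp("2004-12-20"), -15.0, nan, nan, -10.0, nan, nan],
    ["S1", "P3", "H3", pd.Timestamp("2004-12-07"), nan, nan, nan, nan, -15.0, -8.0],
    ["S1", "P3", "H1", pd.Timestamp("2004-12-04"), -15.0, -27.4, nan, -10.0, -15.0, nan],
]
cols = ["C1 (PK)", "C2 (FK)", "C3", "C4", "Q1", "Q2", "Q3", "Q4", "Q5", "Q6"]
df = pd.DataFrame(rows, columns=cols)

result = drop_subsumed_rows(df, key_cols=cols[:2], value_cols=cols[4:])
```

Under these assumptions the all-NaN row, the row holding only Q1 and Q4, and the exact duplicate are dropped, while the row that contributes a Q6 value is kept.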

Related Question (Blue): How to join two rows that have the same keys and complementary values
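For the complementary case, one possible approach (a sketch, not necessarily what the linked question settles on) is to group on the key columns and take the first non-null value in each column with GroupBy.first(). The two-row frame below is made-up illustration data, and the conflict check is an added precaution, since first() silently keeps the earlier value when two rows disagree.

```python
import numpy as np
import pandas as pd

# Illustrative data: two rows with the same keys and complementary values.
df = pd.DataFrame(
    {
        "C1 (PK)": ["S1", "S1"],
        "C2 (FK)": ["P3", "P3"],
        "Q1": [-15.0, np.nan],
        "Q5": [np.nan, -15.0],
        "Q6": [np.nan, -8.0],
    }
)
keys = ["C1 (PK)", "C2 (FK)"]
value_cols = ["Q1", "Q5", "Q6"]

# nunique() ignores NaN, so a count above 1 means two rows disagree on a value.
conflicts = df.groupby(keys)[value_cols].nunique().gt(1)
has_conflicts = bool(conflicts.any().any())

# first() returns the first non-null entry per column within each group.
combined = df.groupby(keys, as_index=False, sort=False).first()
```

If has_conflicts is True, the groups flagged in conflicts need a manual decision before collapsing.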

Data table

Current Progress

My current code includes this line to drop all rows where all quantitative variables are NaN.
df.dropna(subset=df.columns[4:],how='all', inplace=True)

It also includes this line to drop all rows where all quantitative variables are the same.
df.drop_duplicates(subset=df.columns[4:], inplace=True)
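The effect of these two lines can be seen on a small made-up frame (the column names and values below are illustrative only). drop_duplicates treats NaNs in matching positions as equal, so it removes exact copies, but a row that merely has fewer values filled in still counts as distinct and survives.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {
        "Q1": [-15.0, -15.0, np.nan, -15.0],
        "Q2": [-27.4, np.nan, np.nan, -27.4],
    }
)
quant = ["Q1", "Q2"]

# Step 1: remove rows where every quantitative column is NaN (row 2).
step1 = df.dropna(subset=quant, how="all")

# Step 2: remove rows identical on the quantitative columns (row 3 repeats row 0).
step2 = step1.drop_duplicates(subset=quant)

# Row 1 remains even though its only value also appears in row 0.
remaining = list(step2.index)
```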

Example code that can be copied into an IDE.

import pandas as pd
from numpy import nan
from pandas import Timestamp

# Rows of the example table; nan marks a blank cell.
dfO = [['S1','P3','H1',Timestamp('2004-12-04 00:00:00'),-15.0,-27.4,nan,-10.0,-15.0,nan],
 ['S1','P3','H1',Timestamp('2004-12-20 00:00:00'),nan,nan,nan,nan,nan,nan],
 ['S1','P3','H2',Timestamp('2004-12-20 00:00:00'),-15.0,nan,nan,-10.0,nan,nan],
 ['S1','P3','H3',Timestamp('2004-12-07 00:00:00'),nan,nan,nan,nan,-15.0,-8.0],
 ['S1','P3','H1',Timestamp('2004-12-04 00:00:00'),-15.0,-27.4,nan,-10.0,-15.0,nan]]
cols = ['C1 (PK)', 'C2 (FK)', 'C3', 'C4', 'Q1', 'Q2', 'Q3', 'Q4', 'Q5', 'Q6']
df = pd.DataFrame(data=dfO,columns=cols)

# Drop exact duplicate rows.
df.drop_duplicates(inplace=True)
# Drop rows where every quantitative column is NaN.
df.dropna(subset=df.columns[4:],how='all', inplace=True)
# Drop rows whose quantitative columns are all identical.
df.drop_duplicates(subset=df.columns[4:], inplace=True)

Source: https://stackoverflow.com/questions/59772372/how-to-drop-rows-that-are-not-exact-duplicates-but-contain-no-new-information-m

Tags

pandas

join

data-cleaning

drop-duplicates