In a pandas dataframe, how can I drop a random subset of rows that obey a condition?
In other words, if I have a Pandas dataframe w
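Use the frac argument of sample to keep a random fraction of the rows (here half, which is the same as dropping the other half):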
df.sample(frac=.5)
If the fraction you want to drop is stored in a variable n, sample the complement:
n = .5
df.sample(frac=1 - n)
To include the condition, select the rows that match it, sample from those, and pass their index to drop:
df.drop(df.query('Label == 1').sample(frac=.5).index)

   Label   A
0      0   1
1      0   2
2      0   3
4      1  11
6      1  13
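For reference, the same idea can be wrapped into a small helper. This is only a sketch: the frame below is made-up data (the question's original frame is not shown in full), and drop_random_subset and its parameters are hypothetical names, not part of pandas. It assumes the index is unique, since drop removes every row carrying a given index label.

```python
import pandas as pd

# Hypothetical example data; not the frame from the question.
df = pd.DataFrame({
    "Label": [0, 0, 0, 1, 1, 1, 1],
    "A": [1, 2, 3, 10, 11, 12, 13],
})


def drop_random_subset(frame, condition, frac, random_state=None):
    """Drop a random fraction `frac` of the rows where `condition` is True.

    `condition` is a boolean Series aligned with `frame`.
    Assumes `frame` has a unique index.
    """
    # Sample only among the rows that satisfy the condition ...
    to_drop = frame[condition].sample(frac=frac, random_state=random_state).index
    # ... and remove exactly those rows from the full frame.
    return frame.drop(to_drop)


# Drop half of the rows with Label == 1; rows with Label == 0 are untouched.
result = drop_random_subset(df, df["Label"] == 1, frac=0.5, random_state=0)
print(result)
```

Passing random_state makes the selection reproducible; leave it as None to get a different subset on each call.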