In a pandas dataframe, how can I drop a random subset of rows that obey a condition?
In other words, if I have a Pandas dataframe w
Using drop
with sample
df.drop(df[df.Label.eq(1)].sample(2).index)
Label A
0 0 1
1 0 2
2 0 3
3 1 10
5 1 12
Use the frac
argument
df.sample(frac=.5)
If you define the amount you want to drop in a variable n
n = .5
df.sample(frac=1 - n)
To include the condition, use drop
df.drop(df.query('Label == 1').sample(frac=.5).index)
Label A
0 0 1
1 0 2
2 0 3
4 1 11
6 1 13