Question
Suppose I load the MNIST dataset like this:
df = pd.read_csv('data/train.csv')
data = df.loc[df['label'].isin([1,6])]
I am trying to select only the rows whose 'label' column is 1 or 6.
However, I want to get only 500 rows for each of those label values.
How do I do it?
Answer 1:
Use groupby first, then filter, i.e.:
ndf= df.groupby('label').head(500)
data = ndf.loc[ndf['label'].isin([1,6])]
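To see what groupby('label').head(n) does, here is a minimal sketch on a made-up toy frame. The label values and the pixel0 column are invented for illustration, and 2 stands in for 500. It keeps the first n rows of each group and leaves them in their original order.

```python
import pandas as pd

# Toy stand-in for data/train.csv; the values are made up for illustration.
df = pd.DataFrame({
    'label':  [1, 6, 1, 3, 6, 1, 6, 3],
    'pixel0': [0, 1, 2, 3, 4, 5, 6, 7],
})

# First 2 rows of every label (2 stands in for 500), original row order kept.
ndf = df.groupby('label').head(2)

# Keep only the rows labelled 1 or 6.
data = ndf.loc[ndf['label'].isin([1, 6])]
print(data)
```

If a label has fewer than n rows, head(n) simply returns all of them, so the result is not guaranteed to contain exactly n rows per label.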
Answer 2:
You can group them and select the number you want for each value:
data = df.loc[df['label'].isin([1,6])].groupby('label').head(500)
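As a rough sketch, the one-liner can be wrapped in a small helper. The function name first_n_per_label and its parameters are hypothetical, not part of pandas, and the frame below is synthetic rather than the real MNIST file.

```python
import pandas as pd

def first_n_per_label(df, labels, n, column='label'):
    """Return the first n rows for each requested label (hypothetical helper)."""
    # Filter first, so only the needed rows are grouped.
    subset = df.loc[df[column].isin(labels)]
    return subset.groupby(column).head(n)

# Synthetic frame: labels 0-9 interleaved, 10 rows per label.
df = pd.DataFrame({'label': [i % 10 for i in range(100)],
                   'pixel0': range(100)})

data = first_n_per_label(df, labels=[1, 6], n=5)
print(data['label'].value_counts())
```

On this toy data the result matches the group-then-filter approach row for row; filtering first just means groupby has fewer rows to process.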
Source: https://stackoverflow.com/questions/46860038/how-to-get-specific-number-of-rows-based-on-column-values-in-dataframe