How to get specific number of rows based on column values in dataframe [duplicate]

只谈情不闲聊 submitted on 2020-01-16 08:36:07

Question


Suppose I load an MNIST dataset like this:

import pandas as pd

df = pd.read_csv('data/train.csv')
data = df.loc[df['label'].isin([1, 6])]

I am trying to select only the rows where the ['label'] column equals 1 or 6.

However, I want to get only 500 rows for each of those values of ['label'].

How do I do it?


Answer 1:


Use groupby first, then filter, i.e.:

ndf = df.groupby('label').head(500)
data = ndf.loc[ndf['label'].isin([1,6])]
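
To see what this does without the real CSV, here is a minimal sketch on a toy frame. The column `pixel0`, the toy values, and `n = 2` (standing in for 500) are assumptions made up for illustration. Note that `head(n)` keeps the first `n` rows of each label in their original order, and a label with fewer than `n` rows simply returns all of its rows.

```python
import pandas as pd

# Toy stand-in for the MNIST CSV (hypothetical data, not the real file).
df = pd.DataFrame({
    "label": [1, 6, 3, 1, 6, 1, 6, 3, 1, 6],
    "pixel0": list(range(10)),
})

n = 2  # stands in for 500

# Take the first n rows of every label, then keep only labels 1 and 6.
ndf = df.groupby("label").head(n)
data = ndf.loc[ndf["label"].isin([1, 6])]

# Rows per label in the result.
counts = data["label"].value_counts().to_dict()
print(data)
print(counts)
```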



Answer 2:


You can group them and select the number you want for each value:

data = df.loc[df['label'].isin([1, 6])].groupby('label').head(500)
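
Note that `head(500)` returns the first 500 rows of each label in file order. If a random 500 per label is wanted instead, a possible variant (not part of the original answer) is `groupby(...).sample`, which I believe requires pandas 1.1 or later and raises an error if a label has fewer than `n` rows. A rough sketch on hypothetical toy data, with `n = 2` standing in for 500:

```python
import pandas as pd

# Toy stand-in for the MNIST CSV (hypothetical data, not the real file).
df = pd.DataFrame({
    "label": [1, 6, 3, 1, 6, 1, 6, 3, 1, 6],
    "pixel0": list(range(10)),
})

n = 2  # stands in for 500

# Filter to the wanted labels first, then draw n random rows per label.
subset = df.loc[df["label"].isin([1, 6])]
sampled = subset.groupby("label").sample(n=n, random_state=0)

# Rows per label in the random sample.
sampled_counts = sampled["label"].value_counts().to_dict()
print(sampled_counts)
```

Setting `random_state` only makes the draw repeatable; drop it to get a different sample on each run.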


Source: https://stackoverflow.com/questions/46860038/how-to-get-specific-number-of-rows-based-on-column-values-in-dataframe
