pandas: drop duplicates in groupby 'date'

悲&欢浪女 2020-12-17 17:00

In the dataframe below, I would like to eliminate the duplicate cid values so the output from df.groupby('date').cid.size() matches the output fr

1 answer
  • 2020-12-17 17:39

    You don't need groupby to drop duplicates based on a few columns. Pass those columns to drop_duplicates as a subset instead:

    df2 = df.drop_duplicates(["date", "cid"])
    df2.groupby('date').cid.size()
    Out[99]: 
    date
    2005      3
    2006     10
    2007    227
    2008     52
    2009    142
    2010     57
    2011    219
    2012     99
    2013    238
    2014    146
    dtype: int64
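
For anyone who wants to try this without the original data, here is a minimal self-contained sketch. The small dataframe is made up for illustration (the question's data is not shown here), and the `nunique` line is an alternative I am adding on my own assumption that the goal is one count per distinct cid within each date.

```python
import pandas as pd

# Hypothetical sample data; the real dataframe from the question is not shown.
df = pd.DataFrame({
    "date": [2005, 2005, 2005, 2006, 2006],
    "cid":  [1, 1, 2, 3, 3],
})

# Keep only the first row for each (date, cid) pair.
df2 = df.drop_duplicates(subset=["date", "cid"])

# Group sizes now count each cid once per date.
sizes = df2.groupby("date").cid.size()

# Alternative (an assumption about the goal): count distinct cid values
# per date directly, without dropping any rows from df.
uniques = df.groupby("date").cid.nunique()

print(sizes)
print(uniques)
```

One difference to keep in mind: `nunique` skips NaN by default, while `size` counts every row, so the two can disagree if cid has missing values.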
    