Pandas group by operations on a data frame

鱼传尺愫 2020-12-10 15:41

I have a pandas data frame like the one below.

UsrId   JobNos
 1       4
 1       56
 2       23 
 2       55
 2       41
 2       5
 3       78
 1       25         


        
1 Answer
  • 2020-12-10 16:46

    Something like df.groupby('UsrId').JobNos.sum().idxmax() should do it:

    In [1]: import pandas as pd
    
    In [2]: from io import StringIO
    
    In [3]: data = """UsrId   JobNos
       ...:  1       4
       ...:  1       56
       ...:  2       23 
       ...:  2       55
       ...:  2       41
       ...:  2       5
       ...:  3       78
       ...:  1       25
       ...:  3       1"""
    
    In [4]: df = pd.read_csv(StringIO(data), sep=r'\s+')
    
    In [5]: grouped = df.groupby('UsrId')
    
    In [6]: grouped.JobNos.sum()
    Out[6]: 
    UsrId
    1         85
    2        124
    3         79
    Name: JobNos
    
    In [7]: grouped.JobNos.sum().idxmax()
    Out[7]: 2
    

    If you want your results based on the number of items in each group:

    In [8]: grouped.size()
    Out[8]: 
    UsrId
    1        3
    2        4
    3        2
    
    In [9]: grouped.size().idxmax()
    Out[9]: 2
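
Both views (total jobs per user and number of rows per user) can also be computed in a single pass. The following is a minimal sketch, assuming the same sample data as above and a pandas version that supports named aggregation (0.25 or later); the column names `total` and `n_rows` are made up for illustration.

```python
from io import StringIO

import pandas as pd

# Same sample data as in the answer above.
data = """UsrId JobNos
1 4
1 56
2 23
2 55
2 41
2 5
3 78
1 25
3 1"""

df = pd.read_csv(StringIO(data), sep=r"\s+")

# Named aggregation: one output column per keyword argument.
# "count" counts non-null values, which matches size() here because
# the sample data has no missing entries.
summary = df.groupby("UsrId")["JobNos"].agg(total="sum", n_rows="count")

# idxmax returns the index label (the UsrId) of the largest value.
top_by_total = summary["total"].idxmax()
top_by_rows = summary["n_rows"].idxmax()

print(summary)
print(top_by_total, top_by_rows)
```

Note that idxmax returns only the first label when several groups tie for the maximum.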
    

    Update: To get ordered results you can use the .sort_values method (older pandas versions called this .order, which has since been removed):

    In [10]: grouped.JobNos.sum().sort_values(ascending=False)
    Out[10]: 
    UsrId
    2        124
    1         85
    3         79
    Name: JobNos
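
If only the top few users are needed rather than the full ordering, `Series.nlargest` is an alternative. This is a sketch under the assumption that the data matches the sample used in the answer; the variable names `totals` and `top2` are illustrative.

```python
from io import StringIO

import pandas as pd

# Same sample data as in the answer above.
data = """UsrId JobNos
1 4
1 56
2 23
2 55
2 41
2 5
3 78
1 25
3 1"""

df = pd.read_csv(StringIO(data), sep=r"\s+")

# Sum of JobNos per user, indexed by UsrId.
totals = df.groupby("UsrId")["JobNos"].sum()

# Keep only the two users with the largest totals, largest first.
top2 = totals.nlargest(2)

print(top2)
```

Passing a different integer to nlargest changes how many users are kept.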
    