Create Excel-like SUMIFS in Pandas

南旧 2021-01-13 00:37

I recently learned about pandas and was happy to see its analytics functionality. I am trying to convert Excel array functions into the Pandas equivalent to au

2 Answers
  • 2021-01-13 00:54

    I'm sure there is a better way, but this did it in a loop:

    # portID is assumed to be defined earlier. .ix and iteritems() were removed
    # from pandas, so .loc and a plain index loop are used here instead.
    for idx in reportAggregateDF.index:
        reportAggregateDF.loc[idx, 'PORT_WEIGHT'] = reportAggregateDF['SEC_WEIGHT_RATE'][
            (reportAggregateDF['PORT_ID'] == portID) &
            (reportAggregateDF['SEC_ID'] == 0) &
            (reportAggregateDF['GROUP_LIST'] == " ") &
            (reportAggregateDF['START_DATE'] == reportAggregateDF.loc[idx, 'START_DATE']) &
            (reportAggregateDF['END_DATE'] == reportAggregateDF.loc[idx, 'END_DATE'])].sum()
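
The loop above recomputes a filtered sum for every row. One possible loop-free version is sketched below: filter once on the fixed criteria, sum per (START_DATE, END_DATE) pair, and join the totals back onto every row. The sample data, the name `report`, and the value of `port_id` are invented for illustration; only the column names come from the answer.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the loop above.
report = pd.DataFrame({
    "PORT_ID": [1, 1, 1, 2, 1],
    "SEC_ID": [0, 0, 5, 0, 0],
    "GROUP_LIST": [" ", " ", " ", " ", "x"],
    "START_DATE": ["2020-01-01"] * 4 + ["2020-02-01"],
    "END_DATE": ["2020-01-31"] * 4 + ["2020-02-29"],
    "SEC_WEIGHT_RATE": [0.25, 0.5, 1.0, 2.0, 4.0],
})
port_id = 1  # assumed value, standing in for portID

# Criteria that do not depend on the current row.
mask = (
    (report["PORT_ID"] == port_id)
    & (report["SEC_ID"] == 0)
    & (report["GROUP_LIST"] == " ")
)

# One total per (START_DATE, END_DATE) pair among the matching rows.
totals = (
    report[mask]
    .groupby(["START_DATE", "END_DATE"])["SEC_WEIGHT_RATE"]
    .sum()
    .rename("PORT_WEIGHT")
    .reset_index()
)

# Attach the total to every row with the same date pair; pairs with no
# matching rows get 0, like .sum() on an empty selection in the loop.
result = report.merge(totals, on=["START_DATE", "END_DATE"], how="left")
result["PORT_WEIGHT"] = result["PORT_WEIGHT"].fillna(0.0)
print(result)
```

If the frame already has a PORT_WEIGHT column, drop it before the merge so pandas does not add suffixes to the two columns.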
    
  • 2021-01-13 00:55

    You want to use the apply function and a lambda:

    >>> df
         A    B    C    D     E
    0  mitfx  0  200  300  0.25
    1     gs  1  150  320  0.35
    2    duk  1    5    2  0.45
    3    bmo  1  145   65  0.65
    

    Let's say I want to sum column C times E but only if column B == 1 and D is greater than 5:

    df['matches'] = df.apply(lambda x: x['C'] * x['E'] if x['B'] == 1 and x['D'] > 5 else 0, axis=1)
    df.matches.sum()
    

    It might be cleaner to split this into two steps:

    df_subset = df[(df.B == 1) & (df.D > 5)]
    df_subset.apply(lambda x: x.C * x.E, axis=1).sum()
    

    or simply to use vectorized multiplication for speed:

    df_subset = df[(df.B == 1) & (df.D > 5)]
    print((df_subset.C * df_subset.E).sum())
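
As a self-contained sketch of the same idea, the snippet below rebuilds the sample frame from this answer and computes two quantities from a single boolean mask: a plain SUMIFS-style sum of column C, and the conditional sum of C times E. The variable names `mask`, `sumifs_c`, and `weighted` are illustrative only.

```python
import pandas as pd

# The sample frame shown earlier in this answer.
df = pd.DataFrame({
    "A": ["mitfx", "gs", "duk", "bmo"],
    "B": [0, 1, 1, 1],
    "C": [200, 150, 5, 145],
    "D": [300, 320, 2, 65],
    "E": [0.25, 0.35, 0.45, 0.65],
})

# Each criterion is a boolean Series; & combines them element-wise.
mask = (df["B"] == 1) & (df["D"] > 5)

# SUMIFS analogue: sum column C over the rows that meet every criterion.
sumifs_c = df.loc[mask, "C"].sum()

# Conditional sum of C * E: rows that fail the mask contribute 0.
weighted = (df["C"] * df["E"]).where(mask, 0).sum()

print(sumifs_c, weighted)
```

Building the mask once and reusing it keeps every criterion in one place, much like the criteria arguments of an Excel SUMIFS call.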
    

    You are absolutely right to want to do this problem without loops.
