Cumulative Sum using 2 columns

盖世英雄少女心 2021-01-15 22:59

I am trying to create a column that holds a cumulative sum computed using 2 columns. Please see the example of what I am trying to do:

  index lodgement_yea         


        
2 Answers
  •  暖寄归人
    2021-01-15 23:58

    You are almost there, Ian!

    The cumsum() method calculates the cumulative sum of a pandas column. You want that sum computed separately for each word, so group by words first and then apply cumsum() to the sum column:

    In [303]: df_2['cumsum'] = df_2.groupby(['words'])['sum'].cumsum()
    
    In [304]: df_2
    Out[304]: 
       index  lodgement_year      words  sum  cum_sum  cumsum
    0      0            2000        the   14       14      14
    1      1            2000  australia   10       10      10
    2      2            2000       word   12       12      12
    3      3            2000      brand    8        8       8
    4      4            2000      fresh    5        5       5
    5      5            2001        the    8       22      22
    6      6            2001  australia    3       13      13
    7      7            2001     banana    1        1       1
    8      8            2001      brand    7       15      15
    9      9            2001      fresh    1        6       6
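
For anyone who wants to reproduce this outside the IPython session, here is a minimal self-contained sketch. The DataFrame is rebuilt by hand from the output above, so its construction is an assumption about how df_2 was created. The sort_values step is also an addition that the original answer does not include: cumsum() on a groupby accumulates in row order, so sorting by lodgement_year first should keep the running total chronological if the real data is not already ordered.

```python
import pandas as pd

# Rebuild the sample data shown above (hand-typed, not the asker's real data set).
df_2 = pd.DataFrame({
    "lodgement_year": [2000] * 5 + [2001] * 5,
    "words": ["the", "australia", "word", "brand", "fresh",
              "the", "australia", "banana", "brand", "fresh"],
    "sum": [14, 10, 12, 8, 5, 8, 3, 1, 7, 1],
})

# Assumption: make sure rows are in year order, because the running
# total is accumulated in the order the rows appear within each group.
df_2 = df_2.sort_values("lodgement_year", kind="stable")

# Running total of "sum" within each word.
df_2["cumsum"] = df_2.groupby("words")["sum"].cumsum()

print(df_2)
```

With this sample, the new cumsum column should match the cum_sum column from the question, for example 22 for "the" in 2001 (14 + 8).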
    

    Please comment if this fails on your larger data set, and we can work on a more robust version.
