Cumulative Sum using 2 columns

盖世英雄少女心 2021-01-15 22:59

I am trying to create a column that holds a cumulative sum computed using two columns. Please see the example of what I am trying to do: @Faith Akici

  index lodgement_yea         


        
2 answers
  •  鱼传尺愫
    2021-01-16 00:00

    If we only need to consider the column 'words', we can loop through its unique values and compute the cumulative sum of 'sum' for each one:

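    import pandas as pd
    for unique_words in df_2['words'].unique():
        # Running total of 'sum' over the rows that contain this word
        word_cumsum = df_2.loc[df_2['words'] == unique_words, 'sum'].cumsum()
        if 'cum_sum' not in df_2:
            # First word: create the column (rows for other words start as NaN)
            df_2['cum_sum'] = word_cumsum
        else:
            # Later words: fill in their rows, aligned on the index
            df_2.update(pd.DataFrame({'cum_sum': word_cumsum}))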
    

    The above results in:

    >>> print(df_2)
      lodgement_year  sum      words  cum_sum
    0           2000   14        the     14.0
    1           2000   10  australia     10.0
    2           2000   12       word     12.0
    3           2000    8      brand      8.0
    4           2000    5      fresh      5.0
    5           2001    8        the     22.0
    6           2001    3  australia     13.0
    7           2001    1     banana      1.0
    8           2001    7      brand     15.0
    9           2001    1      fresh      6.0
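
As a possible alternative to the loop, the same per-word running total can be sketched with `groupby` followed by `cumsum`. This is only a minimal sketch: the frame below is rebuilt from the printed output above (the original input in the question is not shown), and it assumes rows should be accumulated in `lodgement_year` order.

```python
import pandas as pd

# Sample frame reconstructed from the printed output above (assumption:
# the asker's real data has the same three columns).
df_2 = pd.DataFrame({
    "lodgement_year": [2000] * 5 + [2001] * 5,
    "sum": [14, 10, 12, 8, 5, 8, 3, 1, 7, 1],
    "words": ["the", "australia", "word", "brand", "fresh",
              "the", "australia", "banana", "brand", "fresh"],
})

# Make sure earlier years come first; a stable sort keeps the
# existing row order within each year.
df_2 = df_2.sort_values("lodgement_year", kind="stable")

# Running total of 'sum' within each word, aligned back to the original rows.
df_2["cum_sum"] = df_2.groupby("words")["sum"].cumsum()

print(df_2)
```

Because no intermediate NaN values are introduced, `cum_sum` stays an integer column here, whereas the loop version produces floats.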
    
