cumsum

Resetting cumsum if value goes negative in R

假如想象 posted on 2019-11-28 01:59:19
Question: ve <- c(17, -9, 9, -17, 17, -17, 11, -9, 16, -18, 17, 0, 0, -18, 17, 0, 0, -17, 14, -14, 17, -2, 0, -15, 9, -9, 17, -16, 16, -17, 17, -17, 17, -17, 17, -17, 17, -8, 7, -16, 17, -14, 14, -10, 10, -16, 16, -10, 10, -12, 12, -11, 11, -17, 17, -17, 17, -9, 8, -17, 17, -17, 17, -16, 16, -17, 17, -8, 8, -9, 9, -17, 17, -17, 17, -13, 13, -10, 7, -10, 13, -16, 17, -13, 13, -13, 13, -9, 8, -17, 17, -10, 9, -17, 17, -17, 17, -16, 16, -10, 10, -15, 15, -14, 14, -14, 15, -13, 13, -9, 9, -13, 13, -12, 12,
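
The excerpt is cut off before the question states the exact reset rule, so the following is only a sketch in Python (not R) of one plausible reading of the title: the running sum restarts from zero whenever it would drop below zero. The function name `cumsum_reset_negative` and the clamp-at-zero rule are assumptions, and only the first ten values of `ve` are used.

```python
def cumsum_reset_negative(values):
    """Running sum that restarts from zero whenever it would go negative."""
    total = 0
    out = []
    for v in values:
        # Assumed reset rule: clamp the running total at zero.
        total = max(total + v, 0)
        out.append(total)
    return out


# First ten values of `ve` from the question.
ve_head = [17, -9, 9, -17, 17, -17, 11, -9, 16, -18]
running = cumsum_reset_negative(ve_head)
print(running)  # [17, 8, 17, 0, 17, 0, 11, 2, 18, 0]
```

A plain loop is used here because the reset makes each value depend on the previous result, which an ordinary cumulative sum cannot express directly.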

Cumulative sum with lag

谁说我不能喝 posted on 2019-11-28 00:21:28
Question: I have a very large dataset that, simplified, looks like this:

row  member_id  entry_id  comment_count  timestamp
1    1          a         4              2008-06-09 12:41:00
2    1          b         1              2008-07-14 18:41:00
3    1          c         3              2008-07-17 15:40:00
4    2          d         12             2008-06-09 12:41:00
5    2          e         50             2008-09-18 10:22:00
6    3          f         0              2008-10-03 13:36:00

I can aggregate the counts with the following code:

    transform(df, aggregated_count = ave(comment_count, member_id, FUN = cumsum))

But I want a lag of 1 in the cumulative data; that is, I want cumsum to ignore the current row.
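
One way to think about a cumulative sum that ignores the current row is to take the ordinary per-group cumulative sum and subtract the current value. The sketch below shows that idea in pandas rather than R, using the sample rows from the question; the column name `lagged_count` is made up for illustration.

```python
import pandas as pd

# Sample rows from the question (timestamps omitted for brevity).
df = pd.DataFrame({
    "member_id": [1, 1, 1, 2, 2, 3],
    "entry_id": list("abcdef"),
    "comment_count": [4, 1, 3, 12, 50, 0],
})

# Inclusive running total per member, then remove the current row's value
# so that each row only counts the rows before it.
running = df.groupby("member_id")["comment_count"].cumsum()
df["lagged_count"] = running - df["comment_count"]

print(df["lagged_count"].tolist())  # [0, 4, 5, 0, 12, 0]
```

Subtracting the current value avoids a separate shift step and leaves 0 (rather than a missing value) on the first row of each group.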

How to compute cumulative sum of previous N rows in pandas?

浪子不回头ぞ posted on 2019-11-27 18:04:50
Question: I am working with pandas, but I don't have much experience with it. I have the following DataFrame:

        A
0     NaN
1    0.00
2    0.00
3    3.33
4   10.21
5    6.67
6    7.00
7    8.27
8    6.07
9    2.17
10   3.38
11   2.48
12   2.08
13   6.95
14   0.00
15   1.75
16   6.66
17   9.69
18   6.73
19   6.20
20   3.01
21   0.32
22   0.52

and I need to compute the cumulative sum of the previous 11 rows. When there are fewer than 11 previous rows, the missing ones are assumed to be 0.

        B
0     NaN
1    0.00
2    0.00
3    0.00
4    3.33
5   13.54
6   20.21
7   27.20
8   35.47
9   41.54
10  43.72
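
A possible approach, sketched under the assumption that "previous 11 rows" excludes the current row (which is what the first rows of B suggest): treat NaN as 0, shift the column down by one, and take a rolling sum over a window of 11 with `min_periods=1`. The variable names `a` and `b` are illustrative only.

```python
import pandas as pd

# Column A from the question.
a = pd.Series([
    float("nan"), 0.00, 0.00, 3.33, 10.21, 6.67, 7.00, 8.27, 6.07, 2.17,
    3.38, 2.48, 2.08, 6.95, 0.00, 1.75, 6.66, 9.69, 6.73, 6.20, 3.01,
    0.32, 0.52,
])

# fillna(0): count the missing value as zero.
# shift(1): exclude the current row from its own window.
# rolling(11, min_periods=1): sum up to 11 previous rows, fewer near the top.
b = a.fillna(0).shift(1).rolling(window=11, min_periods=1).sum()

print(b.head(6).round(2).tolist())  # [nan, 0.0, 0.0, 0.0, 3.33, 13.54]
```

Row 0 stays NaN because it has no previous rows at all. Some later rows can differ from the B column in the question by 0.01, presumably because the displayed inputs are rounded.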

Cumsum as a new column in an existing Pandas data

微笑、不失礼 posted on 2019-11-27 16:09:47
I have a pandas dataframe defined as:

A  B  SUM_C
1  1  10
1  2  20

I would like to do a cumulative sum of SUM_C and add it as a new column to the same dataframe. In other words, my end goal is to have a dataframe that looks like this:

A  B  SUM_C  CUMSUM_C
1  1  10     10
1  2  20     30

The question "Using cumsum in pandas on group()" shows how to generate a new dataframe in which the column SUM_C is replaced with its cumulative sum. However, I want to add the cumulative sum as a new column to the existing dataframe. Thank you.

Answer (blacksite): Just apply cumsum on the pandas.Series df['SUM_C'] and assign it to a new
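
The answer text is cut off, but the step it describes (apply cumsum to `df['SUM_C']` and assign the result) can be sketched as follows, assuming the assignment target is a new column named `CUMSUM_C` as in the desired output.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 1], "B": [1, 2], "SUM_C": [10, 20]})

# Series.cumsum returns a new Series; assigning it to a new column name
# adds the column and leaves SUM_C untouched.
df["CUMSUM_C"] = df["SUM_C"].cumsum()

print(df)
```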

generalized cumulative functions in NumPy/SciPy?

随声附和 posted on 2019-11-27 08:39:12
Is there a function in numpy or scipy (or some other library) that generalizes the idea of cumsum and cumprod to an arbitrary function? For example, consider the (theoretical) function

    cumf(func, array)

where func is a function that accepts two floats and returns a float. The particular cases lambda x, y: x + y and lambda x, y: x * y are cumsum and cumprod respectively. For example, if func = lambda x, prev_x: x**2 * prev_x and I apply it as cumf(func, np.array([1, 2, 3])), I would like np.array([1, 4, 9*4]).

Answer: NumPy's ufuncs have accumulate():

In [22]: np.multiply.accumulate([[1, 2, 3], [4, 5, 6]], axis=1)
Out[22]:
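
For an arbitrary Python function, the standard library's `itertools.accumulate` already behaves much like the hypothetical `cumf`. The sketch below is one possible implementation, not the answer's method. It assumes the question's argument order, where `func` receives the current element first and the previous result second, so the arguments are swapped before being handed to `accumulate`.

```python
from itertools import accumulate


def cumf(func, values):
    """Hypothetical helper: out[0] = values[0], out[i] = func(values[i], out[i-1])."""
    # accumulate calls its function as f(previous_result, current_element),
    # so swap the arguments to match the question's func(x, prev_x).
    return list(accumulate(values, lambda prev, x: func(x, prev)))


func = lambda x, prev_x: x**2 * prev_x
result = cumf(func, [1, 2, 3])
print(result)  # [1, 4, 36]

# cumsum and cumprod fall out as special cases.
running_sum = cumf(lambda x, y: x + y, [1, 2, 3])    # [1, 3, 6]
running_prod = cumf(lambda x, y: x * y, [1, 2, 3])   # [1, 2, 6]
```

This runs at Python speed, one call per element, so it trades the performance of a NumPy ufunc for generality.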

Cumsum as a new column in an existing Pandas data

做~自己de王妃 posted on 2019-11-27 06:54:49
Question: I have a pandas dataframe defined as:

A  B  SUM_C
1  1  10
1  2  20

I would like to do a cumulative sum of SUM_C and add it as a new column to the same dataframe. In other words, my end goal is to have a dataframe that looks like this:

A  B  SUM_C  CUMSUM_C
1  1  10     10
1  2  20     30

The question "Using cumsum in pandas on group()" shows how to generate a new dataframe in which the column SUM_C is replaced with its cumulative sum. However, I want to add the cumulative sum as a new column to the existing

Cumsum reset at NaN

徘徊边缘 posted on 2019-11-27 01:36:53
If I have a pandas.core.series.Series named ts containing only 1s or NaNs, like this:

3382  NaN
3381  NaN
...
3369  NaN
3368  NaN
...
15    1
10    NaN
11    1
12    1
13    1
9     NaN
8     NaN
7     NaN
6     NaN
3     NaN
4     1
5     1
2     NaN
1     NaN
0     NaN

I would like to calculate the cumsum of this series, but it should be reset (set to zero) at the location of the NaNs, like below:

3382  0
3381  0
...
3369  0
3368  0
...
15    1
10    0
11    1
12    2
13    3
9     0
8     0
7     0
6     0
3     0
4     1
5     2
2     0
1     0
0     0

Ideally I would like a vectorized solution! I have seen a similar question for Matlab, "Matlab cumsum reset at NaN?", but I don't know how to translate this
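
One vectorized way to get this reset behaviour, offered as a sketch rather than the accepted solution, is to number the blocks that start at each NaN and take a cumulative sum within each block. The series below is a short slice of the question's data, and `block_id` is an illustrative name.

```python
import numpy as np
import pandas as pd

# A short slice of the series from the question (index values kept as shown).
ts = pd.Series(
    [1, np.nan, 1, 1, 1, np.nan, np.nan],
    index=[15, 10, 11, 12, 13, 9, 8],
)

# Every NaN starts a new block: the running count of NaNs labels the blocks.
block_id = ts.isna().cumsum()

# Treat NaN as 0, then restart the cumulative sum inside each block.
result = ts.fillna(0).groupby(block_id).cumsum().astype(int)

print(result.tolist())  # [1, 0, 1, 2, 3, 0, 0]
```

Because each NaN opens its own block and contributes 0, the output is 0 at every NaN position and counts upward across each run of 1s.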

Cumulative sum until maximum reached, then repeat from zero in the next row

蹲街弑〆低调 posted on 2019-11-26 22:53:34
I feel like this is a fairly easy question, but for the life of me I can't seem to find the answer. I have a fairly standard dataframe, and what I am trying to do is sum a column of values until the sum reaches some value (either that exact value or greater than it), at which point a 1 is dropped into a new column (labelled keep) and the summing restarts at 0. I have a column of minutes, the differences between the minutes, a keep column, and a cumulative sum column (the example I am using is much cleaner than the actual full dataset):

minutes     difference  keep  difference_sum
1052991158  0           0     0
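
The rule described above can be sketched as a simple loop in plain Python. The excerpt does not include the full data or the threshold, so the `differences` values, the `limit` of 10, and the function name `cumsum_with_restart` are all hypothetical.

```python
def cumsum_with_restart(differences, limit):
    """Return (keep, sums): keep is 1 on rows where the running sum reaches limit."""
    keep, sums = [], []
    total = 0
    for d in differences:
        total += d
        reached = total >= limit
        keep.append(1 if reached else 0)
        sums.append(total)
        if reached:
            total = 0  # the next row starts summing from zero again
    return keep, sums


# Hypothetical differences and threshold, for illustration only.
differences = [0, 3, 4, 5, 2, 9, 1]
keep, sums = cumsum_with_restart(differences, limit=10)
print(keep)  # [0, 0, 0, 1, 0, 1, 0]
print(sums)  # [0, 3, 7, 12, 2, 11, 1]
```

A loop is used because the restart point depends on the running total itself, so the result cannot be obtained from a single ordinary cumulative sum.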

How can I use cumsum within a group in Pandas?

泄露秘密 posted on 2019-11-26 20:52:04
Question: I have

df = pd.DataFrame.from_dict({'id': ['A', 'B', 'A', 'C', 'D', 'B', 'C'], 'val': [1,2,-3,1,5,6,-2], 'stuff':['12','23232','13','1234','3235','3236','732323']})

   id  stuff   val
0  A   12      1
1  B   23232   2
2  A   13      -3
3  C   1234    1
4  D   3235    5
5  B   3236    6
6  C   732323  -2

I'd like to get a running sum of val for each id, so the desired output looks like this:

   id  stuff   val  cumsum
0  A   12      1    1
1  B   23232   2    2
2  A   13      -3   -2
3  C   1234    1    1
4  D   3235    5    5
5  B   3236    6    8
6  C   732323  -2   -1

This is what I tried:

df['cumsum'] =
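
The asker's own attempt is cut off, so the following is not a reconstruction of it. It is a minimal sketch of one common way to get a per-group running sum in pandas: group by `id`, select `val`, and call `cumsum`, which returns a Series aligned to the original row order.

```python
import pandas as pd

df = pd.DataFrame.from_dict({
    'id': ['A', 'B', 'A', 'C', 'D', 'B', 'C'],
    'val': [1, 2, -3, 1, 5, 6, -2],
    'stuff': ['12', '23232', '13', '1234', '3235', '3236', '732323'],
})

# The grouped cumsum keeps the original index, so it can be assigned
# straight back as a new column without sorting or merging.
df['cumsum'] = df.groupby('id')['val'].cumsum()

print(df['cumsum'].tolist())  # [1, 2, -2, 1, 5, 8, -1]
```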

Calculate cumulative sum within each ID (group)

烂漫一生 posted on 2019-11-26 14:41:18
With data frame:

df <- data.frame(id = rep(1:3, each = 5), hour = rep(1:5, 3), value = sample(1:15))

I want to add a cumulative sum column that matches the id:

df
   id hour value csum
1   1    1     7    7
2   1    2     9   16
3   1    3    15   31
4   1    4    11   42
5   1    5    14   56
6   2    1    10   10
7   2    2     2   12
8   2    3     5   17
9   2    4     6   23
10  2    5     4   27
11  3    1     1    1
12  3    2    13   14
13  3    3     8   22
14  3    4     3   25
15  3    5    12   37

How can I do this efficiently? Thanks!

Answer:

df$csum <- ave(df$value, df$id, FUN=cumsum)

To add to the alternatives, data.table's syntax is nice:

library(data.table)
DT <- data.table(df, key = "id")
DT[, csum := cumsum(value), by = key(DT)]

Or,