cumsum | 易学教程

Multiple cumulative sum within a numpy array

Question: I'm fairly new to NumPy, so I apologize if this has already been asked. I'm looking for a vectorized solution that runs multiple cumulative sums of different sizes within a one-dimensional NumPy array.

    my_vector = np.array([1, 2, 3, 4, 5])
    size_of_groups = np.array([3, 2])

I would like something like np.cumsum.group(my_vector, size_of_groups), giving:

    [1, 3, 6, 4, 9]

I do not want a solution with loops, only NumPy functions or NumPy operations.

Answer 1: Not sure about numpy, but pandas can do this pretty easily
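The answer excerpt is cut off here. For the loop-free NumPy route the question asks for, the following is a minimal sketch, not the answer's own code. `grouped_cumsum` is a hypothetical helper name (NumPy has no such function), and the sketch assumes every group size is positive and that the sizes sum to the length of the vector.

```python
import numpy as np

def grouped_cumsum(values, group_sizes):
    """Cumulative sum that restarts at the beginning of each group (hypothetical helper)."""
    values = np.asarray(values)
    group_sizes = np.asarray(group_sizes)
    total = np.cumsum(values)                       # one global running total
    starts = np.cumsum(group_sizes) - group_sizes   # index of each group's first element
    offsets = total[starts] - values[starts]        # running total just before each group
    # Subtract, from every element, the total accumulated before its group began.
    return total - np.repeat(offsets, group_sizes)

my_vector = np.array([1, 2, 3, 4, 5])
size_of_groups = np.array([3, 2])
result = grouped_cumsum(my_vector, size_of_groups)
print(result)  # [1 3 6 4 9]
```

The idea is to compute one global cumulative sum and then remove the part that belongs to earlier groups, which avoids any Python-level loop.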

Conditional Cumulative Sum in R

I have a time series data frame and want to compute cumulative returns for stock symbols intra-day for a range of dates. When the symbol and/or date changes, the cumulative return should reset. Any help would be appreciated. A small sample of my data frame is below, including what the cumulative sum column should return. Thanks.

       Date      Symbol  Time   Last    Return  Cumulative.Sum
    1  1/2/2013  AA       9:30   42.00  n/a     n/a
    2  1/2/2013  AA      12:00   42.50  1.19%   1.19%
    3  1/2/2013  AA      16:00   42.88  0.89%   2.08%
    4  1/2/2013  AAPL     9:30  387.00  n/a     n/a
    5  1/2/2013  AAPL    12:00  387.87  0.22%   0.22%
    6  1/2/2013  AAPL    16:00  388.69  0.21%   0.44%

R: cumulative sum over rolling date range

In R, how can I calculate cumsum for a defined time period prior to the row being calculated? I'd prefer dplyr if possible. For example, if the period was 10 days, then the function would produce cum_rolling10:

    date        value  cumsum  cum_rolling10
    1/01/2000       9       9              9
    2/01/2000       1      10             10
    5/01/2000       9      19             19
    6/01/2000       3      22             22
    7/01/2000       4      26             26
    8/01/2000       3      29             29
    13/01/2000     10      39             29
    14/01/2000      9      48             38
    18/01/2000      2      50             21
    19/01/2000      9      59             30
    21/01/2000      8      67             38
    25/01/2000      5      72             24
    26/01/2000      1      73             25
    30/01/2000      6      79             20
    31/01/2000      6      85             18

A solution using dplyr, tidyr, lubridate, and zoo.

    library(dplyr)
    library(tidyr)
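The R solution is cut off in this excerpt. As a cross-check of the expected cum_rolling10 column, here is a small sketch in Python with pandas rather than dplyr. It is not the answer's method. It assumes the dates are day/month/year and that the 10-day window excludes the day exactly 10 days earlier, which is what the sample values imply (for 18/01/2000 the value from 8/01/2000 is not counted).

```python
import pandas as pd

dates = ["1/01/2000", "2/01/2000", "5/01/2000", "6/01/2000", "7/01/2000",
         "8/01/2000", "13/01/2000", "14/01/2000", "18/01/2000", "19/01/2000",
         "21/01/2000", "25/01/2000", "26/01/2000", "30/01/2000", "31/01/2000"]
values = [9, 1, 9, 3, 4, 3, 10, 9, 2, 9, 8, 5, 1, 6, 6]

# Index the values by date so that a time-based window can be used.
s = pd.Series(values, index=pd.to_datetime(dates, format="%d/%m/%Y"))

# "10D" sums every row in the interval (t - 10 days, t], closed on the right.
cum_rolling10 = s.rolling("10D").sum().astype(int)
print(cum_rolling10.tolist())
```

The time-based window needs a sorted datetime index, which the sample data already satisfies.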

Python pandas cumsum with reset everytime there is a 0 [duplicate]

This question already has an answer here: Cumsum reset at NaN (4 answers)

I have a matrix with 0s and 1s, and want to do a cumsum on each column that resets to 0 whenever a zero is observed. For example, if we have the following:

    df = pd.DataFrame([[0,1],[1,1],[0,1],[1,0],[1,1],[0,1]], columns=['a','b'])
    print(df)
       a  b
    0  0  1
    1  1  1
    2  0  1
    3  1  0
    4  1  1
    5  0  1

The result I desire is:

    print(df)
       a  b
    0  0  1
    1  1  2
    2  0  3
    3  1  0
    4  2  1
    5  0  2

However, when I try df.cumsum() * df, I am able to correctly identify the 0 elements, but the counter does not reset:

    print(df.cumsum() * df)
       a  b
    0  0  1
    1  1  2
    2  0  3
    3  2  0
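One way to get the reset behaviour, sketched here as an illustration and not taken from the linked answers, is to start a new block at every zero and run cumsum inside each block. `cumsum_reset_at_zero` is a hypothetical helper name.

```python
import pandas as pd

df = pd.DataFrame([[0, 1], [1, 1], [0, 1], [1, 0], [1, 1], [0, 1]],
                  columns=["a", "b"])

def cumsum_reset_at_zero(col):
    """Running sum of one column that restarts whenever a 0 is seen (hypothetical helper)."""
    blocks = (col == 0).cumsum()         # block label increases by one at each zero
    return col.groupby(blocks).cumsum()  # cumulative sum restarts inside each block

result = df.apply(cumsum_reset_at_zero)
print(result)
```

Because each zero opens a new block and contributes 0 to that block's sum, the counter restarts from 0 on that row.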

Cumulative sum with lag

I have a very large dataset that, simplified, looks like this:

    row.  member_id  entry_id  comment_count  timestamp
    1     1          a         4              2008-06-09 12:41:00
    2     1          b         1              2008-07-14 18:41:00
    3     1          c         3              2008-07-17 15:40:00
    4     2          d         12             2008-06-09 12:41:00
    5     2          e         50             2008-09-18 10:22:00
    6     3          f         0              2008-10-03 13:36:00

I can aggregate the counts with the following code:

    transform(df, aggregated_count = ave(comment_count, member_id, FUN = cumsum))

But I want a lag of 1 in the cumulated data; that is, I want cumsum to ignore the current row. The result should be:

    row.  member_id  entry_id  comment_count  timestamp  previous_comments
    1     1          a         4              2008
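The expected output is cut off above, so the following is only a sketch of the stated requirement (a per-member running total that ignores the current row), written in Python with pandas rather than R. It assumes the rows are already in timestamp order within each member_id.

```python
import pandas as pd

df = pd.DataFrame({
    "member_id": [1, 1, 1, 2, 2, 3],
    "entry_id": ["a", "b", "c", "d", "e", "f"],
    "comment_count": [4, 1, 3, 12, 50, 0],
})

# Running total per member, including the current row.
running = df.groupby("member_id")["comment_count"].cumsum()

# Subtracting the current row's own count leaves the total of all earlier rows.
df["previous_comments"] = running - df["comment_count"]
print(df)
```

Subtracting the current value gives the lagged total directly, and it yields 0 for the first entry of each member.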

How can I use cumsum within a group in Pandas?

I have

    df = pd.DataFrame.from_dict({'id': ['A', 'B', 'A', 'C', 'D', 'B', 'C'],
                                 'val': [1, 2, -3, 1, 5, 6, -2],
                                 'stuff': ['12', '23232', '13', '1234', '3235', '3236', '732323']})

      id   stuff  val
    0  A      12    1
    1  B   23232    2
    2  A      13   -3
    3  C    1234    1
    4  D    3235    5
    5  B    3236    6
    6  C  732323   -2

I'd like to get a running sum of val for each id, so the desired output looks like this:

      id   stuff  val  cumsum
    0  A      12    1       1
    1  B   23232    2       2
    2  A      13   -3      -2
    3  C    1234    1       1
    4  D    3235    5       5
    5  B    3236    6       8
    6  C  732323   -2      -1

This is what I tried:

    df['cumsum'] = df.groupby('id').cumsum(['val'])

and

    df['cumsum'] = df.groupby('id').cumsum(['val'])

This is the error I
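The excerpt ends before the error message and before any answer. A minimal sketch of one common way to get the desired column is to select the column on the grouped object before calling cumsum, instead of passing the column name to cumsum:

```python
import pandas as pd

df = pd.DataFrame({
    "id": ["A", "B", "A", "C", "D", "B", "C"],
    "val": [1, 2, -3, 1, 5, 6, -2],
    "stuff": ["12", "23232", "13", "1234", "3235", "3236", "732323"],
})

# Select 'val' from the grouped object, then take the running sum within each id.
# The result is aligned with the original row order, so it can be assigned directly.
df["cumsum"] = df.groupby("id")["val"].cumsum()
print(df)
```

This reproduces the desired cumsum column shown in the question.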

Pandas dataframe - running sum with reset

I want to calculate the running sum in a given column (without using loops, of course). The caveat is that I have another column that specifies when to reset the running sum to the value present in that row. It is best explained by the following example:

       reset  val  desired_col
    0      0    1            1
    1      0    5            6
    2      0    4           10
    3      1    2            2
    4      1   -1           -1
    5      0    6            5
    6      0    4            9
    7      1    2            2

desired_col is the value I want calculated.

You can use cumsum() twice:

    #   reset  val  desired_col
    #0      0    1            1
    #1      0    5            6
    #2      0    4           10
    #3      1    2            2
    #4      1   -1           -1
    #5      0    6            5
    #6      0    4            9
    #7      1    2            2
    df['cumsum'] = df['reset'].cumsum()
    #cumulative sums of groups to column des
    df[
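The answer's code is truncated after the first cumsum. The sketch below shows one plausible way to finish the two-cumsum idea. It is an assumption, not the answer's verbatim code: the first cumsum over reset labels the groups, and the second runs within each group. The output column name `des` follows the answer's comment.

```python
import pandas as pd

df = pd.DataFrame({
    "reset": [0, 0, 0, 1, 1, 0, 0, 1],
    "val": [1, 5, 4, 2, -1, 6, 4, 2],
})

# First cumsum: every 1 in 'reset' increments the label, so each reset starts a new group.
group_id = df["reset"].cumsum()

# Second cumsum: running sum of 'val' inside each group.
df["des"] = df.groupby(group_id)["val"].cumsum()
print(df)
```

The row carrying the reset flag is the first row of its group, so the running sum restarts at that row's own value, as desired_col requires.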

Conditional cumsum with reset

I have a data frame that is already sorted as needed, but now I would like to "slice" it into groups. These groups should have a maximum cumulative value of 10. When the cumulative value is > 10, the cumulative sum should reset and start over again.

    library(dplyr)
    id <- sample(1:15)
    order <- 1:15
    value <- c(4, 5, 7, 3, 8, 1, 2, 5, 3, 6, 2, 6, 3, 1, 4)
    df <- data.frame(id, order, value)
    df

This is the output I'm looking for (I did it "manually"):

    cumsum_10 <- c(4, 9, 7, 10, 8, 9, 2, 7, 10, 6, 8, 6, 9, 10, 4)
    group_10 <- c(1, 1, 2, 2, 3, 3, 4, 4, 4, 5, 5, 6, 6, 6, 7)
    df1 <- data.frame(df,
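Each reset here depends on the running total produced by the previous rows, so this sketch uses an explicit loop. It is written in Python rather than R and is only an illustration of the rule described in the question. `capped_cumsum` is a hypothetical helper name.

```python
def capped_cumsum(values, cap=10):
    """Running sum that starts a new group whenever adding a value would exceed cap."""
    sums, groups = [], []
    running, group = 0, 1
    for v in values:
        if sums and running + v > cap:
            # Adding v would exceed the cap: start a new group at v.
            running = v
            group += 1
        else:
            running += v
        sums.append(running)
        groups.append(group)
    return sums, groups

value = [4, 5, 7, 3, 8, 1, 2, 5, 3, 6, 2, 6, 3, 1, 4]
cumsum_10, group_10 = capped_cumsum(value, cap=10)
print(cumsum_10)
print(group_10)
```

On the sample data this reproduces the manually computed cumsum_10 and group_10 vectors from the question.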

Pandas group by cumsum keep columns

Question: I have spent a few hours now trying to do a "cumulative group by sum" on a pandas dataframe. I have looked at all the Stack Overflow answers, and surprisingly none of them solves my (very elementary) problem. I have a dataframe:

    df1
    Out[8]:
       Name        Date  Amount
    0  Jack  2016-01-31      10
    1  Jack  2016-02-29       5
    2  Jack  2016-02-29       8
    3  Jill  2016-01-31      10
    4  Jill  2016-02-29       5

I am trying to group by ['Name','Date'] and cumsum 'Amount'. That is it. So the desired output is:

    df1
    Out[10]:
       Name  Date  Cumsum
    0  Jack
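The desired output is cut off, so the exact target is not recoverable from this excerpt. The sketch below assumes one reading of the request: first total Amount for each (Name, Date) pair, then accumulate those totals within each Name. Other readings are possible.

```python
import pandas as pd

df1 = pd.DataFrame({
    "Name": ["Jack", "Jack", "Jack", "Jill", "Jill"],
    "Date": ["2016-01-31", "2016-02-29", "2016-02-29", "2016-01-31", "2016-02-29"],
    "Amount": [10, 5, 8, 10, 5],
})

# Step 1: one row per (Name, Date) with the summed Amount; keys stay as columns.
totals = df1.groupby(["Name", "Date"], as_index=False)["Amount"].sum()

# Step 2: running sum of those totals within each Name.
totals["Cumsum"] = totals.groupby("Name")["Amount"].cumsum()
print(totals[["Name", "Date", "Cumsum"]])
```

Passing as_index=False keeps Name and Date as ordinary columns, which matches the "keep columns" part of the title.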