rolling-computation | 易学教程

Pandas rolling sum for multiply values separately

Question: I have the following dataframe:

    a = pd.DataFrame({'unit': [2, 2, 3, 3, 3, 4, 4, 4, 5],
                      'date': [1, 2, 1, 2, 3, 1, 2, 3, 1],
                      'revenue': [1, 1, 3, 5, 7, 6, 6, 2, 9]})

Pandas rolling.sum with window = 2:

    a['rolled_sum'] = a.rolling(2, on='date').sum().shift(+1)['revenue']

computes this sum row by row:

       unit  date  revenue  rolled_sum
    0     2     1        1         NaN
    1     2     2        1         NaN
    2     3     1        3         2.0
    3     3     2        5         4.0
    4     3     3        7         8.0
    5     4     1        6        12.0
    6     4     2        6        13.0
    7     4     3        2        12.0
    8     5     1        9         8.0

I would like to have this rolling sum computed for each
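The excerpt is cut off, but the title suggests the rolling sum should restart for each value of 'unit'. A minimal sketch under that assumption groups by 'unit' and applies the same window and shift inside each group. The use of groupby with transform is an illustration here, not something taken from the original question.

```python
import pandas as pd

a = pd.DataFrame({'unit': [2, 2, 3, 3, 3, 4, 4, 4, 5],
                  'date': [1, 2, 1, 2, 3, 1, 2, 3, 1],
                  'revenue': [1, 1, 3, 5, 7, 6, 6, 2, 9]})

# Assumption: "for each" means "for each unit". Within every unit,
# sum the two previous rows (window of 2, then shifted down by one row).
a['rolled_sum'] = (
    a.groupby('unit')['revenue']
     .transform(lambda s: s.rolling(2).sum().shift(1))
)
print(a)
```

With this grouping, a row only receives a value once its own unit has two earlier rows, so sums no longer leak across unit boundaries.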

R: create a data frame out of a rolling window

Question: Let's say I have a data frame with the following structure:

    DF <- data.frame(x = 0:4, y = 5:9)
    > DF
      x y
    1 0 5
    2 1 6
    3 2 7
    4 3 8
    5 4 9

What is the most efficient way to turn 'DF' into a data frame with the following structure:

    w x y
    1 0 5
    1 1 6
    2 1 6
    2 2 7
    3 2 7
    3 3 8
    4 3 8
    4 4 9

where w is a length-2 window rolling through the data frame 'DF'? The length of the window should be arbitrary, i.e. a length of 3 yields

    w x y
    1 0 5
    1 1 6
    1 2 7
    2 1 6
    2 2 7
    2 3 8
    3 2 7
    3 3 8
    3 4 9

I am a bit stumped by

Pandas sum over a date range for each category separately

Question: I have a dataframe with timeseries of sales transactions for different items:

    import pandas as pd
    from datetime import timedelta

    df_1 = pd.DataFrame()
    df_2 = pd.DataFrame()
    df_3 = pd.DataFrame()

    # Create datetimes and data
    df_1['date'] = pd.date_range('1/1/2018', periods=5, freq='D')
    df_1['item'] = 1
    df_1['sales'] = 2
    df_2['date'] = pd.date_range('1/1/2018', periods=5, freq='D')
    df_2['item'] = 2
    df_2['sales'] = 3
    df_3['date'] = pd.date_range('1/1/2018', periods=5, freq='D')
    df_3['item'] = 3
    df
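The excerpt ends before the actual question, so the following is only a sketch of what the title describes: summing sales over a date range separately for each item. It uses the two items whose sales values are visible above (item 3 is omitted because its value is cut off). The date range and the 3-day window are hypothetical choices made for illustration.

```python
import pandas as pd

dates = pd.date_range('1/1/2018', periods=5, freq='D')
df = pd.concat([
    pd.DataFrame({'date': dates, 'item': 1, 'sales': 2}),
    pd.DataFrame({'date': dates, 'item': 2, 'sales': 3}),
], ignore_index=True)

# Total per item over one fixed date range (both end points inclusive).
# The range itself is a hypothetical example.
start, end = pd.Timestamp('2018-01-02'), pd.Timestamp('2018-01-04')
in_range = df[df['date'].between(start, end)]
range_totals = in_range.groupby('item')['sales'].sum()

# Trailing 3-day sum per item, evaluated at every date.
rolling_totals = (
    df.set_index('date')
      .groupby('item')['sales']
      .rolling('3D')
      .sum()
)
print(range_totals)
print(rolling_totals)
```

The first result has one total per item, while the second is indexed by (item, date) and gives a running window total for every row.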

Create a ROLLING sum over a period of time in mysql

Question: I have a table with columns date and time_spent. I want to find, for each date D, the sum of the values of 'time_spent' over the period from D-7 to D, i.e. the past week plus the current day. I can't figure out a way to do this, as I can only find examples for a total sum and not a sum over a variable period of time. Here is an example dataset:

    CREATE TABLE rolling_total ( date date, time_spent int );
    INSERT INTO rolling_total VALUES
      ('2013-09-01','2'),
      ('2013-09-02','1'),
      ('2013-09-03','3'),
      ('2013
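The question asks for MySQL, but the window arithmetic (D-7 through D inclusive) can be checked outside the database. The sketch below does this in pandas, using only the three rows that are visible before the excerpt is cut off. It is a way to verify expected totals, not a substitute for the SQL query being asked about.

```python
import pandas as pd

# Only the three rows visible in the truncated INSERT statement.
rolling_total = pd.DataFrame({
    'date': pd.to_datetime(['2013-09-01', '2013-09-02', '2013-09-03']),
    'time_spent': [2, 1, 3],
})

# A time-based window of '8D' is closed on the right by default, so it
# covers (D - 8 days, D], which is exactly the dates D-7 through D.
week_sum = (
    rolling_total.set_index('date')['time_spent']
    .rolling('8D')
    .sum()
)
print(week_sum)
```

Because the window is defined in days rather than rows, missing dates in the table do not widen the period that is summed.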

Apply sum product on columns of a dataframe in rolling windows

Question: I have a set of defined weights and I want to calculate the weighted sum of returns in rolling windows on a time series data frame. I believe we would use rollapplyr here, but I am unsure how to apply a rolling window function across each row of the data frame. Below is the dput output of a sample of the data:

    tempVar <- structure(c(NA, -0.0081833512947922, 0.00508150903899551, -0.0072202479734873, 0.00345258369231161, NA, 0, -0.00847462699097257, -0.00794638265247283, 0.00445091892889238, NA, NA

Speeding up rolling sum calculation in pandas groupby

Question: I want to compute rolling sums group-wise for a large number of groups and I'm having trouble doing it acceptably quickly. Pandas has built-in methods for rolling and expanding calculations. Here's an example:

    import pandas as pd
    import numpy as np

    obs_per_g = 20
    g = 10000
    obs = g * obs_per_g
    k = 20
    df = pd.DataFrame(
        data=np.random.normal(size=obs * k).reshape(obs, k),
        index=pd.MultiIndex.from_product(iterables=[range(g), range(obs_per_g)]),
    )

To get rolling and expanding sums I can use df
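One possible way to speed this up, sketched here under the assumption that every group has the same number of rows (true for the from_product index above), is to reshape the values into a 3-D array and take differences of cumulative sums along the within-group axis. The sizes are reduced and the window length is a hypothetical choice so that the comparison with the built-in groupby rolling sum runs quickly.

```python
import numpy as np
import pandas as pd

# Smaller sizes than in the question so the check runs fast.
obs_per_g, g, k, window = 20, 50, 4, 3
obs = g * obs_per_g
rng = np.random.default_rng(0)
df = pd.DataFrame(
    data=rng.normal(size=(obs, k)),
    index=pd.MultiIndex.from_product([range(g), range(obs_per_g)]),
)

# Reference result: pandas' built-in group-wise rolling sum.
slow = df.groupby(level=0).rolling(window).sum()

# Assumption: equal-sized, contiguous groups, so the data can be viewed
# as (group, row within group, column).
arr = df.to_numpy().reshape(g, obs_per_g, k)
cs = np.cumsum(arr, axis=1)
fast = cs.copy()
# Sum over rows t-window+1 .. t equals cs[t] - cs[t-window].
fast[:, window:] = cs[:, window:] - cs[:, :-window]
# The first window-1 rows of each group have no complete window.
fast[:, :window - 1] = np.nan
fast_df = pd.DataFrame(fast.reshape(obs, k), index=df.index)

same = np.allclose(slow.to_numpy(), fast_df.to_numpy(), equal_nan=True)
print(same)
```

Differences of cumulative sums can lose some floating-point precision on long series with large values, so the results are compared with a tolerance rather than for exact equality.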

Rolling percentage add along column

Question: I feel this should be easy in base R but I just can't figure it out. I have a simple data frame; let's say it looks like this:

    tbl <- read.table(text = "Field1 Field2
    100 200
    150 180
    200 160
    280 250
    300 300
    300 250", header = TRUE)

Now, what I want to do is create a function that applies a rolling % addition, something like:

    fn <- function(tbl, pct) {}

which accepts the data frame above as tbl. It adds a percentage fraction of the current row to the NEXT row down based on pct, and rolls

r calculating rolling average with window based on value (not number of rows or date/time variable)

Question: I'm quite new to all the packages meant for calculating rolling averages in R and I hope you can point me in the right direction. I have the following data as an example:

    ms <- c(300, 300, 300, 301, 303, 305, 305, 306, 308, 310, 310, 311, 312, 314, 315, 315, 316, 316, 316, 317, 318, 320, 320, 321, 322, 324, 328, 329, 330, 330, 330, 332, 332, 334, 334, 335, 335, 336, 336, 337, 338, 338, 338, 340, 340, 341, 342, 342, 342, 342)
    correct <- c(1, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0, 1, 0, 0,

count number of times a factor appears during rolling window

Question: I want to generate the column "PriorityCountInLast7Days". For a given employee A, this column counts the number of CASES in the last 7 days where PRIORITY is the same as the current case. How would I do that in R with the first 4 columns?

    data <- data.frame(
      Date = c("2018-06-01", "2018-06-03", "2018-06-03", "2018-06-03", "2018-06-04", "2018-06-01", "2018-06-02", "2018-06-03"),
      Emp1 = c("A","A","A","A","A","A","B","B","B"),
      Case = c("A1", "A2", "A3", "A4", "A5", "A6", "B1", "B2", "B3"),