Cumsum excluding current value

Submitted by 蹲街弑〆低调 on 2019-12-24 20:03:00

Question


I am new to R and I am trying to write a function that cumulatively sums the items each customer has previously ordered. I found an almost-fitting example on Stack Overflow, but I have not managed to adapt it to my needs.

This is the code:

Fruits <- Fruits[order(Cars$order.id), ]  #sort data
Fruits$prev_Apples<-with(Fruits, 
    ave(
        ave(Apples, customer.id, FUN=cumsum),  #get running sum per customer.id
        interaction(customer.id, order.id, drop=T), 
    FUN=max, na.rm=T) #find largest sum per index per seg
)

And this is the Fruits data.frame:

order.id  customer.id  Apples  Peaches  Pears
1001      J Car Ltd    1       0        0
1002      Som Comp     0       2        0
1005      Richardson   0       0        1
1004      J Car Ltd    1       0        0
1003      J Car Ltd    2       0        0
1006      Richardson   1       0        1
1007      Aldridge     0       0        1
1008      J Car Ltd    0       0        1
1010      Som Comp     0       1        0
1009      J Car Ltd    1       0        0

This is what I would like to obtain:

order id	customer id	Apples	Peaches	Pears	Prev_Apples
1001	J Car Ltd	1	0	0	0
1002	Som Comp	0	2	0	0
1003	J Car Ltd	2	0	0	1
1004	J Car Ltd	1	0	0	3
1005	Richardson	0	0	1	0
1006	Richardson	1	0	1	0
1007	Aldridge	0	0	1	0
1008	J Car Ltd	0	0	1	4
1009	J Car Ltd	1	0	0	4
1010	Som Comp	0	1	0	0

And this is what I actually get:

order id	customer id	Apples	Peaches	Pears	Prev_Apples
1001	J Car Ltd	1	0	0	1
1002	Som Comp	0	2	0	0
1003	J Car Ltd	2	0	0	3
1004	J Car Ltd	1	0	0	4
1005	Richardson	0	0	1	0
1006	Richardson	1	0	1	1
1007	Aldridge	0	0	1	0
1008	J Car Ltd	0	0	1	4
1009	J Car Ltd	1	0	0	5
1010	Som Comp	0	1	0	0

So the problem is that cumsum also includes the Apples from the current order, whereas I want it to include only the previous orders. How should I modify the code? Any answer will be highly appreciated.


Answer 1:


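Assuming the input shown reproducibly in the Note at the end, we first sort Fruits by order.id (fixing the erroneous reference to Cars in the question's code) and then use ave with cumsum to get the running sum of Apples per customer. Subtracting the current value of Apples from that running sum cancels the last term, leaving only the sum of the previous orders.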

This gives the same answer as the one listed as expected in the question.

Fruits <- Fruits[order(Fruits$order.id), ]
transform(Fruits, Prev_Apples = ave(Apples, customer.id, FUN = cumsum) - Apples)

giving:

   order.id customer.id Apples Peaches Pears Prev_Apples
1      1001   J Car Ltd      1       0     0           0
2      1002    Som Comp      0       2     0           0
5      1003   J Car Ltd      2       0     0           1
4      1004   J Car Ltd      1       0     0           3
3      1005  Richardson      0       0     1           0
6      1006  Richardson      1       0     1           0
7      1007    Aldridge      0       0     1           0
8      1008   J Car Ltd      0       0     1           4
10     1009   J Car Ltd      1       0     0           4
9      1010    Som Comp      0       1     0           0
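
As a cross-check, the subtraction can be compared with an equivalent formulation that shifts the running sum down by one position within each customer. The sketch below is only an illustration: the helper name prev_sum and the data frame name orders are hypothetical, and orders is a trimmed, already sorted copy of the Fruits data (Apples only).

```r
# Trimmed, already sorted copy of the question's data (Apples only).
# "orders" is a hypothetical name used for this sketch.
orders <- data.frame(
  order.id    = 1001:1010,
  customer.id = c("J Car Ltd", "Som Comp", "J Car Ltd", "J Car Ltd",
                  "Richardson", "Richardson", "Aldridge", "J Car Ltd",
                  "J Car Ltd", "Som Comp"),
  Apples      = c(1L, 0L, 2L, 1L, 0L, 1L, 0L, 0L, 1L, 0L)
)

# Hypothetical helper: running sum shifted by one, so position i holds
# the sum of elements 1..(i-1); the first element is always 0.
prev_sum <- function(x) c(0L, cumsum(x))[seq_along(x)]

# Approach from the answer: running sum per customer minus the current value.
by_subtraction <- with(orders, ave(Apples, customer.id, FUN = cumsum) - Apples)

# Shifted running sum per customer.
by_shift <- with(orders, ave(Apples, customer.id, FUN = prev_sum))

# Both approaches should agree row by row.
stopifnot(all(by_subtraction == by_shift))
```

The shifted form may be useful when the cumulative function cannot be undone by subtraction, for example cummax; for a plain sum, subtracting the current value is shorter.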

Note: The input in reproducible form is assumed to be:

Fruits <- structure(list(order.id = c(1001L, 1002L, 1005L, 1004L, 1003L, 
1006L, 1007L, 1008L, 1010L, 1009L), customer.id = structure(c(2L, 
4L, 3L, 2L, 2L, 3L, 1L, 2L, 4L, 2L), .Label = c("Aldridge", "J Car Ltd", 
"Richardson", "Som Comp"), class = "factor"), Apples = c(1L, 
0L, 0L, 1L, 2L, 1L, 0L, 0L, 0L, 1L), Peaches = c(0L, 2L, 0L, 
0L, 0L, 0L, 0L, 0L, 1L, 0L), Pears = c(0L, 0L, 1L, 0L, 0L, 1L, 
1L, 1L, 0L, 0L)), .Names = c("order.id", "customer.id", "Apples", 
"Peaches", "Pears"), class = "data.frame", row.names = c(NA, 
-10L))


Source: https://stackoverflow.com/questions/47637184/cumsum-excluding-current-value
