Use previous calculated row value in R (continued)

Question

I have a data.table that looks like this:

library(data.table)
DT <- data.table(A=1:20, B=1:20*10, C=1:20*100)
DT
     A   B    C
 1:  1  10  100
 2:  2  20  200
 3:  3  30  300
 4:  4  40  400
 5:  5  50  500
...
20: 20 200 2000

I want to compute a new column "G" whose first value is the average of the first 20 rows of column B. Each subsequent value of G should then be calculated from the previous row's value of G.

Say the average of the first 20 rows of column B is 105, and the formula for the next row of G is DT$G[2] = DT$G[1]*2, then DT$G[3] = DT$G[2]*2, and so on. In other words, the first value is used directly only for the second row; every later row depends on the row immediately before it.

     A   B    C        G
 1:  1  10  100      105
 2:  2  20  200      210
 3:  3  30  300      420
 4:  4  40  400      840
 5:  5  50  500     1680
...
20: 20 200 2000 55050240

Any ideas on how this could be done?

Answer 1:

You can do this with a little arithmetic:

DT$G <- mean(DT$B[1:20])
DT$G <- DT$G * cumprod(rep(2,nrow(DT)))/2

Or using data.table syntax, courtesy of @DavidArenburg:

DT[ , G := mean(B[1:20]) * cumprod(rep(2, .N)) / 2]

Or, from @Frank:

DT$G <- cumprod(c( mean(head(DT$B,20)), rep(2,nrow(DT)-1) ))
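The arithmetic works because the recurrence G[i] = 2 * G[i-1] with G[1] = m has the closed form G[i] = m * 2^(i-1). A minimal base R sketch that checks this against the cumprod version and an explicit loop; the names g_closed, g_cumprod and g_loop are illustrative only, and a plain vector B stands in for column B of the question:

```r
# Base R only; B mirrors column B from the question (10, 20, ..., 200).
B <- 1:20 * 10
n <- length(B)
m <- mean(B[1:20])                            # 105

# Closed form of the doubling recurrence: m * 2^(i - 1)
g_closed <- m * 2^(seq_len(n) - 1)

# Same sequence as a cumulative product: m, then n - 1 factors of 2
g_cumprod <- cumprod(c(m, rep(2, n - 1)))

# Same sequence built row by row from the previous value
g_loop <- numeric(n)
g_loop[1] <- m
for (i in seq_len(n)[-1]) g_loop[i] <- 2 * g_loop[i - 1]

stopifnot(isTRUE(all.equal(g_closed, g_cumprod)),
          isTRUE(all.equal(g_closed, g_loop)))

g_closed[c(1, 2, 20)]                         # 105 210 55050240
```

All three forms agree, so the vectorised version can replace the loop whenever each step is a fixed multiplication.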

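Answer 2:

# Build G iteratively: start from the mean of x, then double the previous value
mycalc <- function(x, n) {
  y <- numeric(n)
  y[1] <- mean(x)
  # seq_len(n)[-1] is empty when n == 1, whereas 2:n would count down as c(2, 1)
  for (i in seq_len(n)[-1]) y[i] <- 2 * y[i - 1]
  y
}
DT[ , G := mycalc(B[1:20], .N)]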
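The same row-by-row idea can also be written without an explicit loop by using base R's Reduce() with accumulate = TRUE. This is only a sketch under assumptions: the helper names step and g_reduce are hypothetical, and a plain vector B stands in for column B of the question.

```r
# Base R only; B mirrors column B from the question.
B <- 1:20 * 10
n <- length(B)

# One step of the recurrence: the new value depends only on the previous one.
# The second argument (the row position) is unused for plain doubling.
step <- function(prev, i) 2 * prev

# Start from the mean and apply `step` n - 1 times, keeping every
# intermediate result; unlist() guarantees a plain numeric vector.
g_reduce <- unlist(Reduce(step, seq_len(n - 1), accumulate = TRUE,
                          init = mean(B[1:20])))

g_reduce[1:5]                                 # 105 210 420 840 1680
```

For a fixed multiplier the cumprod approach is simpler; a step function like this may be more useful when the update rule is not a plain product.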

Source: https://stackoverflow.com/questions/33394070/use-previous-calculated-row-value-in-r-continued

Tags

data.table

lag
