Question
I have a data frame df with six columns: A, B, C, D, E, and F. I need to create a new column G equal to previous value of C + previous value of D + previous value of G - F. This has to be computed within groups defined by columns A and B (group by A and B). For the first row of each group, G should equal E.
Sample data frame:
A B C D E F
1 2 100 200 300 0
1 2 110 210 310 10
1 2 120 130 300 10
1 1 140 150 80 0
1 1 50 60 80 20
1 1 50 60 80 20
Expected output:
A B C D E F G
1 2 100 200 300 0 300
1 2 110 210 310 10 590
1 2 120 130 300 10 900
1 1 140 150 80 0 80
1 1 50 60 80 20 350
1 1 50 60 80 20 440
Please provide a suitable solution.
Answer 1:
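Here is one option with dplyr: group by 'A' and 'B', take the lag of 'C', 'D', and 'E', add (+) them, subtract 'F', and coalesce with the 'E' column so that the first row of each group falls back to 'E'. Because this uses the lag of 'E' rather than the previous value of 'G', it matches the requested output only for the first two rows of each group.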
library(dplyr)

df1 %>%
  group_by(A, B) %>%
  mutate(G = coalesce(lag(C) + lag(D) + lag(E) - F, E))
Output:
# A tibble: 6 x 7
# Groups: A, B [2]
# A B C D E F G
# <int> <int> <int> <int> <int> <int> <int>
#1 1 2 100 200 300 0 300
#2 1 2 110 210 310 10 590
#3 1 2 120 130 300 10 620
#4 1 1 140 150 80 0 80
#5 1 1 50 60 80 20 350
#6 1 1 50 60 80 20 170
Data:
df1 <- structure(list(A = c(1L, 1L, 1L, 1L, 1L, 1L),
                      B = c(2L, 2L, 2L, 1L, 1L, 1L),
                      C = c(100L, 110L, 120L, 140L, 50L, 50L),
                      D = c(200L, 210L, 130L, 150L, 60L, 60L),
                      E = c(300L, 310L, 300L, 80L, 80L, 80L),
                      F = c(0L, 10L, 10L, 0L, 20L, 20L)),
                 class = "data.frame", row.names = c(NA, -6L))
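The recurrence stated in the question depends on the previous value of G itself, which lag() cannot see while G is still being built. One possible way to handle this, sketched here and not part of the original answer, is to notice that each row only adds a known increment (previous C + previous D - current F) to the previous G, so G is a grouped cumulative sum that starts at E. The helper column name `step` is introduced here for illustration, and the sketch assumes the rows are already in the desired order within each group.

```r
library(dplyr)

# Same sample data as in the question
df1 <- data.frame(
  A = c(1L, 1L, 1L, 1L, 1L, 1L),
  B = c(2L, 2L, 2L, 1L, 1L, 1L),
  C = c(100L, 110L, 120L, 140L, 50L, 50L),
  D = c(200L, 210L, 130L, 150L, 60L, 60L),
  E = c(300L, 310L, 300L, 80L, 80L, 80L),
  F = c(0L, 10L, 10L, 0L, 20L, 20L)
)

out <- df1 %>%
  group_by(A, B) %>%
  mutate(
    # First row of a group contributes E; every later row contributes
    # previous C + previous D - current F (hypothetical helper column)
    step = if_else(row_number() == 1L, E, lag(C) + lag(D) - F),
    # Running total of the increments reproduces G[i] = C[i-1] + D[i-1] + G[i-1] - F[i]
    G = cumsum(step)
  ) %>%
  ungroup() %>%
  select(-step)

print(out)
```

On the sample data this gives G = 300, 590, 900 for the group A = 1, B = 2 and G = 80, 350, 440 for the group A = 1, B = 1, which is the output requested in the question. If a recurrence cannot be unrolled into a sum like this, an explicit loop or an accumulating function per group would be the more general route.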
Source: https://stackoverflow.com/questions/65115435/iteration-through-rows-of-a-dataframe-within-group-of-columns-in-r