I have the following issue using R. In short, I want to create multiple new columns in a data frame, each calculated from a different pair of existing columns in that data frame.
In base R, fully vectorized:
nms <- names(df)
df[paste0("sum_",unique(gsub("[1-9]","",nms)))] <-
df[endsWith(nms,"1")] + df[endsWith(nms,"2")]
# a1 b1 c1 a2 b2 c2 sum_a sum_b sum_c
# 1 1 4 10 9 3 15 10 7 25
# 2 2 5 11 10 4 16 12 9 27
# 3 3 6 12 11 5 17 14 11 29
# 4 4 7 13 12 6 18 16 13 31
# 5 5 8 14 13 7 19 18 15 33
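For anyone who wants to run the base R approach above end to end, here is a minimal reproducible sketch. The data frame is rebuilt from the printed output, since the original `df` definition is not shown here, so its construction is an assumption. The pattern `[0-9]+$` is also my substitution for `[1-9]`, so that suffixes containing a zero would be stripped too.

```r
# Assumed input, reconstructed from the printed output above
df <- data.frame(a1 = 1:5, b1 = 4:8, c1 = 10:14,
                 a2 = 9:13, b2 = 3:7, c2 = 15:19)

nms <- names(df)
# Strip the trailing digits to get the prefixes: "a", "b", "c"
prefixes <- unique(sub("[0-9]+$", "", nms))

# Adding two data frames works element-wise and by column position,
# so the *1 and *2 columns must appear in the same prefix order
df[paste0("sum_", prefixes)] <- df[endsWith(nms, "1")] + df[endsWith(nms, "2")]
df
```

Because the addition is positional, this only gives the right pairs when the `*1` and `*2` columns are ordered consistently, as they are in this example.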
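With dplyr and purrr:
library(dplyr)
library(purrr)
# pmap_dbl() passes the selected columns of each row to sum()
df %>%
  mutate(sum_a = pmap_dbl(select(., starts_with("a")), sum),
         sum_b = pmap_dbl(select(., starts_with("b")), sum),
         sum_c = pmap_dbl(select(., starts_with("c")), sum))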
a1 b1 c1 a2 b2 c2 sum_a sum_b sum_c
1 1 4 10 9 3 15 10 7 25
2 2 5 11 10 4 16 12 9 27
3 3 6 12 11 5 17 14 11 29
4 4 7 13 12 6 18 16 13 31
5 5 8 14 13 7 19 18 15 33
EDIT:
If there are many columns and you want to apply this programmatically:
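# Sum, row by row, all columns of df whose names start with x;
# returns a one-column data frame named sum_<x>
row_sums <- function(x) {
  transmute(df, !!paste0("sum_", x) := pmap_dbl(select(df, starts_with(x)), sum))
}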
newdf <- map_dfc(letters[1:3], row_sums)
newdf
sum_a sum_b sum_c
1 10 7 25
2 12 9 27
3 14 11 29
4 16 13 31
5 18 15 33
And if needed you can tack on the original variables with:
bind_cols(df, newdf)
a1 b1 c1 a2 b2 c2 sum_a sum_b sum_c
1 1 4 10 9 3 15 10 7 25
2 2 5 11 10 4 16 12 9 27
3 3 6 12 11 5 17 14 11 29
4 4 7 13 12 6 18 16 13 31
5 5 8 14 13 7 19 18 15 33
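The programmatic idea can also be sketched in base R without hard-coding the prefixes. This is only an illustration, not part of the answers above: it assumes every column is named `<prefix><digits>`, that the prefixes contain no regex metacharacters, and it uses a `df` rebuilt from the printed output. The names `prefixes`, `sums`, and `result` are hypothetical.

```r
# Assumed input, reconstructed from the printed output above
df <- data.frame(a1 = 1:5, b1 = 4:8, c1 = 10:14,
                 a2 = 9:13, b2 = 3:7, c2 = 15:19)

# Derive the prefixes from the column names instead of listing them by hand
prefixes <- unique(sub("[0-9]+$", "", names(df)))

# For each prefix, sum across every column that shares it
sums <- sapply(prefixes, function(p) {
  cols <- grepl(paste0("^", p, "[0-9]+$"), names(df))
  rowSums(df[cols])
})
colnames(sums) <- paste0("sum_", prefixes)

# Tack the new columns onto the original data
result <- cbind(df, sums)
result
```

Unlike the pairwise addition, `rowSums()` handles any number of columns per prefix (for example `a1`, `a2`, `a3`) and does not depend on column order.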