I have the following data frame df
:
v1 v2 v3 v4
1 1 5 7 4
2 2 6 10 3
And I want to obtain the following data frame
You are really close.
df2 <-
df %>%
mutate(v1v3 = v1 * v3,
v2v4 = v2 * v4)
such a beautifully simple language, right?
For more great tricks please see here.
EDIT: Thanks to @Facottons pointer to this answer: https://stackoverflow.com/a/34377242/5088194, here is a tidy approach to resolving this issue. It keeps one from having to write a line to hard code in each new column desired. While it is a bit more verbose than the Base R approach, the logic is at least more immediately transparent/readable. It is also worth noting that there must be at least half as many rows as there are columns for this approach to work.
# prep the product column names (also acting as row numbers)
df <-
df %>%
mutate(prod_grp = paste0("v", row_number(), "v", row_number() + 2))
# converting data to tidy format and pairing columns to be multiplied together.
tidy_df <-
df %>%
gather(column, value, -prod_grp) %>%
mutate(column = as.numeric(sub("v", "", column)),
pair = column - 2) %>%
mutate(pair = if_else(pair < 1, pair + 2, pair))
# summarize the products for each column
prod_df <-
tidy_df %>%
group_by(prod_grp, pair) %>%
summarize(val = prod(value)) %>%
spread(prod_grp, val) %>%
mutate(pair = paste0("v", pair, "v", pair + 2)) %>%
rename(prod_grp = pair)
# put the original frame and summary frames together
final_df <-
df %>%
left_join(prod_df) %>%
select(-prod_grp)
I think I found a solution:
df %>%
mutate(n = df[1:(ncol(df)/2)] * df[(1+ncol(df)/2):(ncol(df))]) %>% head()
The result is valid for any number of variables. It only remains a problem with the name of the new variables. This is the result:
v1 v2 v3 v4 n.v1 n.v2
1 1 5 7 4 7 20
2 2 6 10 3 20 18
We can use base R
instead of using any extra packages like dplyr
or data.table
We can use mapply
to vectorize the operation for multiple vectors at the same time
n <- ncol(df)/2
mapply(`*`, df[1:n], df[(n + 1):ncol(df)])
# v1 v2
#[1,] 7 20
#[2,] 20 18
We can merge (cbind
) this dataframe to your original one then.
If you are interested in tidyverse
solution the equivalent in purrr
would be variants of map2
purrr::map2_df(df[1:n], df[(n + 1):ncol(df)], `*`)
# A tibble: 2 x 2
# v1 v2
# <dbl> <dbl>
#1 7 20
#2 20 18
Just use mutate as is with a comma to separate new columns mutate(df,"v1v3"=v1*v3,"v2v4"= v2*v4)