Take this simple dataset and function (representative of more complex problems):
x <- data.frame(a = 1:3, b = 2:4)
mult <- function(a,b,n) (a + b) * n
The best approach I've found (which is still not terribly elegant) is to pipe into bind_cols. To get pmap_dfr to work correctly, the function should return a named list (which may or may not be a data frame):
library(tidyverse)
x <- data.frame(a = 1:3, b = 2:4)
mult <- function(a,b,n) as.list(set_names((a + b) * n, paste0('new', n)))
x %>% bind_cols(pmap_dfr(., mult, n = 1:2))
#>   a b new1 new2
#> 1 1 2    3    6
#> 2 2 3    5   10
#> 3 3 4    7   14
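To see why the named list matters, it can help to call the modified mult on a single row by hand. This is only a minimal sketch, assuming purrr is installed; row1 is just an illustrative name:

```r
library(purrr)

# Same definition as above: one list element per value of n,
# named new<n> so the names can become column names later
mult <- function(a, b, n) as.list(set_names((a + b) * n, paste0("new", n)))

# What a single iteration of pmap_dfr would compute for the first row
row1 <- mult(a = 1, b = 2, n = 1:2)
str(row1)
```

pmap_dfr makes one such call per row and binds the resulting lists row-wise, so the list names end up as the new column names.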
To avoid changing the definition of mult, you can wrap it in an anonymous function:
mult <- function(a,b,n) (a + b) * n
x %>% bind_cols(pmap_dfr(
    .,
    ~as.list(set_names(
        mult(...),
        paste0('new', 1:2)
    )),
    n = 1:2
))
#>   a b new1 new2
#> 1 1 2    3    6
#> 2 2 3    5   10
#> 3 3 4    7   14
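If the formula shorthand is hard to read, the same wrapper can be written as an ordinary function. The sketch below is one possible variant, not the only way to do it: mult_named is a hypothetical name, and it assumes n is always passed by name (as pmap does here) so that the column names can be derived from n instead of hard-coding 1:2.

```r
library(purrr)

mult <- function(a, b, n) (a + b) * n

# Hypothetical wrapper: forwards everything to mult() untouched,
# then names the result after the values of n it was given
mult_named <- function(...) {
  n <- list(...)$n  # NULL if n was not passed by name
  as.list(set_names(mult(...), paste0("new", n)))
}

out <- mult_named(a = 1, b = 2, n = 1:2)
str(out)
```

It can then be used in place of the formula, e.g. pmap_dfr(x, mult_named, n = 1:2).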
In this particular case, it's not actually necessary to iterate over rows, though, because you can vectorize the inputs from x and instead iterate over n. The advantage is that x usually has more rows than there are values of n, so the number of iterations will be (potentially much) lower. To be clear, whether such an approach is possible depends on which parameters of the function can accept vector arguments.
mult still needs to be called on the variables of x. The simplest way to do this is to pass them explicitly:
x %>% bind_cols(map_dfc(1:2, ~mult(x$a, x$b, .x)))
#>   a b V1 V2
#> 1 1 2  3  6
#> 2 2 3  5 10
#> 3 3 4  7 14
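To check that mult really is vectorized over a and b, here is a base R sketch of the same idea, with lapply standing in for map_dfc. The names new_cols and result are only illustrative:

```r
x <- data.frame(a = 1:3, b = 2:4)
mult <- function(a, b, n) (a + b) * n

# A single call covers every row, because + and * are vectorized
all_rows <- mult(x$a, x$b, 2)

# Iterate over the values of n instead of over the rows of x
new_cols <- lapply(1:2, function(n) mult(x$a, x$b, n))
names(new_cols) <- paste0("new", 1:2)

result <- cbind(x, as.data.frame(new_cols))
result
```

Here the function is called twice (once per value of n) rather than three times (once per row); the gap grows with the number of rows.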
...but this loses the benefit of pmap that named variables will automatically get passed to the correct parameter. You can get that back by using purrr::lift, which is an adverb that changes the domain of a function so it accepts a list by wrapping it in do.call. The returned function can be called on x and the value of n for that iteration:
x %>% bind_cols(map_dfc(1:2, ~lift(mult)(x, n = .x)))
This is equivalent to
x %>% bind_cols(map_dfc(1:2, ~invoke(mult, x, n = .x)))
but the advantage of the former is that it returns a function which can be partially applied on x so it only has an n parameter left, and thus requires no explicit references to x and so pipes better:
x %>% bind_cols(map_dfc(1:2, partial(lift(mult), .)))
All return the same thing. Names can be fixed after the fact with %>% set_names(~sub('^V(\\d+)$', 'new\\1', .x)), if you like.
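For what it's worth, lift and invoke are deprecated in recent purrr releases (1.0.0 onward, as far as I know). The same partial-application idea can be sketched in base R with do.call and a closure. Note that mult_on is a hypothetical helper, not part of any package, and the V1/V2 names are set by hand here just to mirror the output above:

```r
x <- data.frame(a = 1:3, b = 2:4)
mult <- function(a, b, n) (a + b) * n

# Hypothetical helper: fixes the data frame and returns a function of n only.
# do.call matches the columns of df to mult's parameters by name.
mult_on <- function(df) {
  function(n) do.call(mult, c(as.list(df), list(n = n)))
}

mult_x <- mult_on(x)            # only n is left to supply
cols <- lapply(1:2, mult_x)
names(cols) <- paste0("V", 1:2)

result <- cbind(x, as.data.frame(cols))

# Same renaming pattern as above, applied with base sub()
names(result) <- sub("^V(\\d+)$", "new\\1", names(result))
result
```

Because the columns are matched by name inside do.call, this keeps the pmap-style benefit of not having to spell out x$a and x$b.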