My question builds on a similar one by imposing an additional constraint that the name of each variable should appear only once.
Consider a data frame
I propose 2 tricks to solve this issue, see the code and some details for both solutions at the bottom :
A function .at
that returns results for for groups of variables (here only one variable by group) that we can then unsplice, so we benefit from both worlds, summarize
and summarize_at
:
df %>% summarize(
!!!.at(vars(potentially_long_name_i_dont_want_to_type_twice), mean),
!!!.at(vars(another_annoyingly_long_name), sum))
# # A tibble: 1 x 2
# potentially_long_name_i_dont_want_to_type_twice another_annoyingly_long_name
#
# 1 5.5 255
An adverb to summarize
, with a dollar notation shorthand.
df %>%
..flx$summarize(potentially_long_name_i_dont_want_to_type_twice = ~mean(.),
another_annoyingly_long_name = ~sum(.))
# # A tibble: 1 x 2
# potentially_long_name_i_dont_want_to_type_twice another_annoyingly_long_name
#
# 1 5.5 255
code for .at
It has to be used in a pipe because it uses the .
in the parent environment, messy but it works.
.at <- function(.vars, .funs, ...) {
in_a_piped_fun <- exists(".",parent.frame()) &&
length(ls(envir=parent.frame(), all.names = TRUE)) == 1
if (!in_a_piped_fun)
stop(".at() must be called as an argument to a piped function")
.tbl <- try(eval.parent(quote(.)))
dplyr:::manip_at(
.tbl, .vars, .funs, rlang::enquo(.funs), rlang:::caller_env(),
.include_group_vars = TRUE, ...)
}
I designed it to combine summarize
and summarize_at
:
df %>% summarize(
!!!.at(vars(potentially_long_name_i_dont_want_to_type_twice), list(foo=min, bar = max)),
!!!.at(vars(another_annoyingly_long_name), median))
# # A tibble: 1 x 3
# foo bar another_annoyingly_long_name
#
# 1 1 10 25.5
code for ..flx
..flx
outputs a function that replaces its formula arguments such as a = ~mean(.)
by calls a = purrr::as_mapper(~mean(.))(a)
before running. Convenient with summarize
and mutate
because a column cannot be a formula so there can't be any conflict.
I like to use the dollar notation as a shorthand and to have names starting with ..
so I can name those "tags" (and give them a class "tag"
) and see them as different objects (still experimenting with this). ..flx(summarize)(...)
will work as well though.
..flx <- function(fun){
function(...){
mc <- match.call()
mc[[1]] <- tail(mc[[1]],1)[[1]]
mc[] <- imap(mc,~if(is.call(.) && identical(.[[1]],quote(`~`))) {
rlang::expr(purrr::as_mapper(!!.)(!!sym(.y)))
} else .)
eval.parent(mc)
}
}
class(..flx) <- "tag"
`$.tag` <- function(e1, e2){
# change original call so x$y, which is `$.tag`(tag=x, data=y), becomes x(y)
mc <- match.call()
mc[[1]] <- mc[[2]]
mc[[2]] <- NULL
names(mc) <- NULL
# evaluate it in parent env
eval.parent(mc)
}