There are a couple of issues about this on the dplyr GitHub repo already, and at least one related SO question, but none of them quite covers my question -- I think.
Yet another option could be to use the purrr::map family of functions.
If you replace rbind with dplyr::bind_rows in the get_binCI function:
library(tidyverse)
dd <- data.frame(x = c(3, 4), n = c(10, 11))
get_binCI <- function(x, n) {
  bind_rows(setNames(c(binom.test(x, n)$conf.int), c("lwr", "upr")))
}
You can use purrr::map2 with tidyr::unnest:
dd %>% mutate(result = map2(x, n, get_binCI)) %>% unnest(result)
#> x n lwr upr
#> 1 3 10 0.06673951 0.6524529
#> 2 4 11 0.10926344 0.6920953
Or purrr::map2_dfr with dplyr::bind_cols:
dd %>% bind_cols(map2_dfr(.$x, .$n, get_binCI))
#> x n lwr upr
#> 1 3 10 0.06673951 0.6524529
#> 2 4 11 0.10926344 0.6920953
Here's a quick solution using the data.table package instead.
First, a little change to the function:
get_binCI <- function(x,n) as.list(setNames(binom.test(x,n)$conf.int, c("lwr", "upr")))
Then, simply:
library(data.table)
setDT(dd)[, get_binCI(x, n), by = .(x, n)]
# x n lwr upr
# 1: 3 10 0.06673951 0.6524529
# 2: 4 11 0.10926344 0.6920953
Old question (with plenty of good answers), but this is a great use case for tidyverse's broom package, which deals with tidying output from test and modeling objects (such as binom.test, lm, etc).
It's more verbose than other methods, but I think it matches your desire for a more expressive approach.
The process is:
1. Group the rows you want to run binom.test on (in this case, those groups are defined by x and n) and nest them, creating separate data.frames for each (within the full data.frame)
2. map the binom.test call to the x and n values from each group
3. tidy the binom.test output for each group (this is where broom comes in)
4. unnest the tidied test output data.frames into the full data.frame
Now you're left with a data.frame where each row contains the x and n values, combined with all of the output from the corresponding binom.test, neatly formatted with separate columns for each bit of output information (point estimate, upper/lower conf, p-value, etc).
library(tidyverse)
library(broom)
dd <- data.frame(x=c(3,4),n=c(10,11))
dd %>%
  group_by(x, n) %>%
  nest() %>%
  mutate(test = map(data, ~tidy(binom.test(x, n)))) %>%
  unnest(test)
#> # A tibble: 2 x 11
#> # Groups: x, n [2]
#> x n data estimate statistic p.value parameter conf.low conf.high
#> <dbl> <dbl> <lis> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 3 10 <tib… 0.3 3 0.344 10 0.0667 0.652
#> 2 4 11 <tib… 0.364 4 0.549 11 0.109 0.692
#> # … with 2 more variables: method <chr>, alternative <chr>
From here you can get to your exact desired format with just a bit more manipulation, selecting the desired output variables, and renaming them:
dd %>%
  group_by(x, n) %>%
  nest() %>%
  mutate(test = map(data, ~tidy(binom.test(x, n)))) %>%
  unnest(test) %>%
  rename(lwr = conf.low, upr = conf.high) %>%
  select(x, n, lwr, upr)
#> # A tibble: 2 x 4
#> # Groups: x, n [2]
#> x n lwr upr
#> <dbl> <dbl> <dbl> <dbl>
#> 1 3 10 0.0667 0.652
#> 2 4 11 0.109 0.692
As mentioned, it's verbose. Much more so than (for example) @joran's beautifully succinct
dd %>%
  group_by(x,n) %>%
  do(foo(.$x,.$n))
However, the benefit of the broom approach is that you won't need to define a function foo (or get_binCI). It's fully self-contained, and in my opinion far more expressive and flexible.
Here are some possibilities with rowwise and nesting.
library("dplyr")
library("tidyr")
A data frame with repeated x/n combinations, for fun:
dd <- data.frame(x=c(3, 4, 3), n=c(10, 11, 10))
A version of the CI function that returns a data frame, like @Joran's:
get_binCI_df <- function(x,n) {
  binom.test(x, n)$conf.int %>%
    setNames(c("lwr", "upr")) %>%
    as.list() %>% as.data.frame()
}
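Grouping by x and n, as before, removes the duplicate. Because the duplicated group now holds two rows, .$x and .$n are length-2 vectors there, and binom.test treats a length-2 x as c(successes, failures); passing only the first element of each keeps the call scalar.
dd %>% group_by(x,n) %>% do(get_binCI_df(.$x[1],.$n[1]))
# # A tibble: 2 x 4
# # Groups: x, n [2]
# x n lwr upr
# <dbl> <dbl> <dbl> <dbl>
# 1 3 10 0.06673951 0.6524529
# 2 4 11 0.1092634 0.6920953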
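To see why multi-row groups matter with do(), here is a small base-R check (a sketch, not part of the original answers; the variable names are made up for illustration). It relies on the documented behaviour of binom.test, where a length-2 x is read as successes and failures and n is then not used:

```r
# With a length-2 x, binom.test() reads x as c(successes, failures)
# and derives the number of trials from it; the supplied n is ignored.
ci_vector  <- as.numeric(binom.test(c(3, 3), c(10, 10))$conf.int)

# So the call above is effectively "3 successes out of 6 trials" ...
ci_3_of_6  <- as.numeric(binom.test(3, 6)$conf.int)

# ... and not the intended "3 successes out of 10 trials".
ci_3_of_10 <- as.numeric(binom.test(3, 10)$conf.int)

same_as_3_of_6  <- isTRUE(all.equal(ci_vector, ci_3_of_6))
same_as_3_of_10 <- isTRUE(all.equal(ci_vector, ci_3_of_10))
print(c(same_as_3_of_6 = same_as_3_of_6, same_as_3_of_10 = same_as_3_of_10))
```

If the first flag is TRUE and the second FALSE, a group with duplicated rows needs scalar inputs (for example the first element of each column) to get the interval for 3 out of 10.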
Using rowwise keeps all the rows but removes x and n unless you put them back using cbind(., ...) (like Ben does in his OP).
dd %>% rowwise() %>% do(cbind(., get_binCI_df(.$x,.$n)))
# Source: local data frame [3 x 4]
# Groups: <by row>
#
# # A tibble: 3 x 4
# x n lwr upr
# * <dbl> <dbl> <dbl> <dbl>
# 1 3 10 0.06673951 0.6524529
# 2 4 11 0.10926344 0.6920953
# 3 3 10 0.06673951 0.6524529
It feels like nesting could work more cleanly, but this is as good as I can get. Using mutate means I can use x and n directly instead of .$x and .$n, but mutate expects a single value, so it needs to be wrapped in list.
dd %>% rowwise() %>% mutate(ci=list(get_binCI_df(x, n))) %>% unnest(ci)
# # A tibble: 3 x 4
# x n lwr upr
# <dbl> <dbl> <dbl> <dbl>
# 1 3 10 0.06673951 0.6524529
# 2 4 11 0.10926344 0.6920953
# 3 3 10 0.06673951 0.6524529
Finally, looks like something like this is an open issue (as of 5 Oct 2017) for dplyr; see https://github.com/tidyverse/dplyr/issues/2326; if something like that is implemented then that will be the easiest way!
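For readers on a newer dplyr, here is a hedged sketch of a shorter form. It assumes dplyr 1.0.0 or later, where (as far as I know) an unnamed data-frame result inside mutate() is unpacked into columns; I have not checked whether that is the change the linked issue asked for. get_binCI_df is redefined here without pipes so the block is self-contained:

```r
library(dplyr)

dd <- data.frame(x = c(3, 4, 3), n = c(10, 11, 10))

# Returns a one-row data frame with the lower and upper confidence limits.
get_binCI_df <- function(x, n) {
  ci <- binom.test(x, n)$conf.int
  data.frame(lwr = ci[1], upr = ci[2])
}

# Assumption: dplyr >= 1.0.0. The unnamed data-frame result is spliced
# into lwr and upr columns, one binom.test() call per row.
res <- dd %>%
  rowwise() %>%
  mutate(get_binCI_df(x, n)) %>%
  ungroup()

print(res)
```

This keeps all three rows, including the duplicate, and needs neither list() nor unnest().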
Yet another variant, although I think we're all splitting hairs here.
> dd <- data.frame(x=c(3,4),n=c(10,11))
> get_binCI <- function(x,n) {
+ as_data_frame(setNames(as.list(binom.test(x,n)$conf.int),c("lwr","upr")))
+ }
>
> dd %>%
+ group_by(x,n) %>%
+ do(get_binCI(.$x,.$n))
Source: local data frame [2 x 4]
Groups: x, n
x n lwr upr
1 3 10 0.06673951 0.6524529
2 4 11 0.10926344 0.6920953
Personally, if we're just going by readability, I find this preferable:
foo <- function(x,n){
  bi <- binom.test(x,n)$conf.int
  data_frame(lwr = bi[1],
             upr = bi[2])
}
dd %>%
  group_by(x,n) %>%
  do(foo(.$x,.$n))
...but now we're really splitting hairs.
This uses a "standard" dplyr workflow, but as @BenBolker notes in the comments, it requires calling get_binCI twice:
dd %>% group_by(x,n) %>%
  mutate(lwr=get_binCI(x,n)[1],
         upr=get_binCI(x,n)[2])
x n lwr upr
1 3 10 0.06673951 0.6524529
2 4 11 0.10926344 0.6920953
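If the double call is a concern, one way around it is to compute each interval once and bind the results back on. The following is a base-R sketch rather than a dplyr idiom, and the names ci and res are only illustrative:

```r
dd <- data.frame(x = c(3, 4), n = c(10, 11))

# mapply() calls binom.test() once per row; each call returns two numbers,
# so the result is a 2 x nrow(dd) matrix, transposed here to one row per test.
ci <- t(mapply(function(x, n) binom.test(x, n)$conf.int, dd$x, dd$n))
colnames(ci) <- c("lwr", "upr")

# Bind the limits next to the original columns.
res <- cbind(dd, ci)
print(res)
```

Each row of dd triggers exactly one binom.test() call, at the cost of stepping outside the pipe.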