I'm trying to solve the following problem in R: I have a dataframe with two variables (number of successes, and number of total trials).
# A tibble: 4 x 2
We can use pmap after renaming the columns to match the argument names of 'prop.test' (x and n)
pmap(setNames(df, c("x", "n")), prop.test)
Or using map2
map2(df$Success, df$N, prop.test)
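To try the two calls above, here is a minimal self-contained sketch. The data frame is an assumption: it is rebuilt from the Success and N values printed further down in this answer, since the original data is not shown in full.

```r
library(purrr)
library(tibble)

# Assumed data, reconstructed from the values printed in this answer
df <- tibble(Success = c(38, 12, 27, 9), N = c(50, 50, 50, 50))

# map2 walks both columns in parallel; each pair is passed
# positionally, i.e. prop.test(x = Success[i], n = N[i])
tests <- map2(df$Success, df$N, prop.test)

# One "htest" object per row of df
length(tests)
class(tests[[1]])
```

Each element of the list is a regular prop.test result, so pieces such as p.value or conf.int can be pulled out with [[ ]].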
The problem with map is that it loops through each of the columns of the dataset, and a data frame is a list of vectors
df %>%
map(~ .x)
#$Success
#[1] 38 12 27 9
#$N
#[1] 50 50 50 50
So, we cannot do .x$Success or .x$N
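The column-wise versus row-wise difference can be sketched as follows. This is only an illustration with assumed data (the same values printed above); the proportion computed in the second call is a stand-in for any function that needs both values of a row.

```r
library(purrr)

# Assumed data, reconstructed from the values printed in this answer
df <- data.frame(Success = c(38, 12, 27, 9), N = c(50, 50, 50, 50))

# map() makes one call per column: .x is a whole vector of length 4
col_lengths <- map_int(df, length)

# pmap() makes one call per row: the columns are passed as named
# arguments, so both values of a row are available together
props <- pmap_dbl(df, function(Success, N) Success / N)

col_lengths
props
```

This is why pmap (or map2) fits the row-by-row prop.test call, while map does not.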
As @Steven Beaupre mentioned, if we need to create new columns with the p-value and confidence interval:
res <- df %>%
mutate(newcol = map2(Success, N, prop.test),
pval = map_dbl(newcol, ~ .x[["p.value"]]),
CI = map(newcol, ~ as.numeric(.x[["conf.int"]]))) %>%
select(-newcol)
# A tibble: 4 x 4
# Success N pval CI
# <dbl> <dbl> <dbl> <list>
#1 38.0 50.0 0.000407 <dbl [2]>
#2 12.0 50.0 0.000407 <dbl [2]>
#3 27.0 50.0 0.671 <dbl [2]>
#4 9.00 50.0 0.0000116 <dbl [2]>
The 'CI' column is a list of 2 elements, which can be unnested to get the data in 'long' format
res %>%
  unnest(CI)
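To see what the 'long' format looks like, here is a self-contained sketch. The data is an assumption (rebuilt from the values printed above), and the column is named explicitly in unnest() on the assumption that a recent tidyr is installed, where leaving the column out triggers a warning.

```r
library(dplyr)
library(purrr)
library(tidyr)

# Assumed data, reconstructed from the values printed in this answer
df <- tibble(Success = c(38, 12, 27, 9), N = c(50, 50, 50, 50))

res <- df %>%
  mutate(newcol = map2(Success, N, prop.test),
         pval = map_dbl(newcol, ~ .x[["p.value"]]),
         CI = map(newcol, ~ as.numeric(.x[["conf.int"]]))) %>%
  select(-newcol)

# Each row holds a length-2 vector in CI, so unnesting yields
# two rows per original row (lower bound, then upper bound)
long <- res %>%
  unnest(CI)

nrow(long)
```

The other columns (Success, N, pval) are repeated on both rows of each pair.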
Or create 3 columns
df %>%
mutate(newcol = map2(Success, N, ~ prop.test(.x, n = .y) %>%
{tibble(pvalue = .[["p.value"]],
CI_lower = .[["conf.int"]][[1]],
CI_upper = .[["conf.int"]][[2]])})) %>%
unnest(newcol)
# A tibble: 4 x 5
# Success N pvalue CI_lower CI_upper
# <dbl> <dbl> <dbl> <dbl> <dbl>
#1 38.0 50.0 0.000407 0.615 0.865
#2 12.0 50.0 0.000407 0.135 0.385
#3 27.0 50.0 0.671 0.395 0.679
#4 9.00 50.0 0.0000116 0.0905 0.319
If you want a new column, you'd use @akrun's approach but sprinkle in a little dplyr and broom amongst the purrr.
library(tidyverse) # for dplyr, purrr, tidyr & co.
library(broom)
analysis <- df %>%
set_names(c("x","n")) %>%
mutate(result = pmap(., prop.test)) %>%
mutate(result = map(result, tidy))
That gives you the results in a tidy nested tibble. If you want to limit that to certain variables, follow the same mutate/map pattern to apply functions to the nested frame, then unnest().
analysis %>%
mutate(result = map(result, ~select(.x, p.value, conf.low, conf.high))) %>%
unnest(result)
# A tibble: 4 x 5
x n p.value conf.low conf.high
<dbl> <dbl> <dbl> <dbl> <dbl>
1 38.0 50.0 0.000407 0.615 0.865
2 12.0 50.0 0.000407 0.135 0.385
3 27.0 50.0 0.671 0.395 0.679
4 9.00 50.0 0.0000116 0.0905 0.319