I would like to add overall summary rows while also calculating summaries by group using dplyr. I have found various questions asking how to do this, e.g. here, here, and he
One option is with do
mtcars %>%
count(cyl, gear) %>%
ungroup() %>%
mutate(cyl=as.character(cyl)) %>%
do(bind_rows(., data.frame(cyl="Total", count(mtcars, gear))))
#or replace the last 'do' step with
#bind_rows(cbind(cyl='Total', count(mtcars, gear))) #from @JonnyPolonsky's comments
# cyl gear n
# <chr> <dbl> <int>
#1 4 3 1
#2 4 4 8
#3 4 5 2
#4 6 3 2
#5 6 4 4
#6 6 5 1
#7 8 3 12
#8 8 5 2
#9 Total 3 15
#10 Total 4 12
#11 Total 5 5
If you would like to have truly universal solution you could use a combination of purrr::map_df, base::c and base::sum
mtcars %>%
purrr::map_df(~c(.x, sum(.x, na.rm=TRUE))) %>%
tail
P.S. All columns must be numeric!
With adorn_totals() from the janitor package:
library(janitor)
mtcars %>%
tabyl(cyl, gear) %>%
adorn_totals("row")
cyl 3 4 5
4 1 8 2
6 2 4 1
8 12 0 2
Total 15 12 5
To get from there to the "long" form in your post, add tidyr::gather()
to the pipeline:
mtcars %>%
tabyl(cyl, gear) %>%
adorn_totals("row") %>%
tidyr::gather(gear, n, 2:ncol(.), convert = TRUE)
cyl gear n
1 4 3 1
2 6 3 2
3 8 3 12
4 Total 3 15
5 4 4 8
6 6 4 4
7 8 4 0
8 Total 4 12
9 4 5 2
10 6 5 1
11 8 5 2
12 Total 5 5
Self-promotion alert, I authored this package - adding this answer b/c it's a genuinely efficient solution here.
Here's a take on the accepted answer, using new functions introduced in dplyr 1.0.0 and tidyr 1.0.0.
We pivot the counts using the new tidyr::pivot_wider
. Then use the new dplyr::rowwise
and dplyr::c_across
to sum the counts for the total column.
We can also use tidyr::pivot_longer
to get in desired long format.
library(dplyr, warn.conflicts = FALSE)
library(tidyr)
cyl_gear_sum <- mtcars %>%
count(cyl, gear) %>%
pivot_wider(names_from = gear, values_from = n, values_fill = list(n = 0)) %>%
rowwise(cyl) %>%
mutate(gear_total = sum(c_across()))
cyl_gear_sum
#> # A tibble: 3 x 5
#> # Rowwise: cyl
#> cyl `3` `4` `5` gear_total
#> <dbl> <int> <int> <int> <int>
#> 1 4 1 8 2 11
#> 2 6 2 4 1 7
#> 3 8 12 0 2 14
# total as row
cyl_gear_sum %>%
pivot_longer(-cyl, names_to = "gear", values_to = "n")
#> # A tibble: 12 x 3
#> cyl gear n
#> <dbl> <chr> <int>
#> 1 4 3 1
#> 2 4 4 8
#> 3 4 5 2
#> 4 4 gear_total 11
#> 5 6 3 2
#> 6 6 4 4
#> 7 6 5 1
#> 8 6 gear_total 7
#> 9 8 3 12
#> 10 8 4 0
#> 11 8 5 2
#> 12 8 gear_total 14
Created on 2020-04-07 by the reprex package (v0.3.0)
An addition to @arkrun's answer that is not easy to add as a comment:
Although a little more complex, this format allows for previous modifications in the data frame. Useful when there is a longer chain of verbs before the table is generated. (You want to change names, or select only specific variables)
mtcars %>%
count(cyl, gear) %>%
ungroup() %>%
mutate(cyl=as.character(cyl))
bind_rows(group_by(.,gear) %>%
summarise(n=sum(n)) %>%
mutate(cyl='Total')) %>%
spread(cyl)
## A tibble: 3 x 5
# gear `4` `6` `8` Total
#* <dbl> <dbl> <dbl> <dbl> <dbl>
#1 3 1 2 12 15
#2 4 8 4 0 12
#3 5 2 1 2 5
This can also be doubled up to generate a total row for the spread as well.
mtcars %>%
count(cyl, gear) %>%
ungroup() %>%
mutate(cyl=as.character(cyl),
gear = as.character(gear)) %>%
bind_rows(group_by(.,gear) %>%
summarise(n=sum(n)) %>%
mutate(cyl='Total')) %>%
bind_rows(group_by(.,cyl) %>%
summarise(n=sum(n)) %>%
mutate(gear='Total')) %>%
spread(cyl,n,fill=0)
# A tibble: 4 x 5
gear `4` `6` `8` Total
* <chr> <dbl> <dbl> <dbl> <dbl>
1 3 1 2 12 15
2 4 8 4 0 12
3 5 2 1 2 5
4 Total 11 7 14 32
Here is my suggestion.
Note. If grouping variables are numeric, they will not be dropped in step 3 - I therefore mutate them to character variables.
powerSetList <- function(df, ...) {
rje::powerSet(x = c(...))[-1] %>% lapply(function(x, tdf = df) group_by(tdf, .dots=x)) %>% c(list(tibble(df)), .)
}
mtcars %>%
mutate_at(vars("cyl", "gear"), as.character) %>%
powerSetList("cyl", "gear") %>%
map(~summarise_if(., is.numeric, .funs = mean)) %>%
bind_rows() %>%
replace_na(list(gear = "all gears",
cyl = "all cyls"))