Calculate percent from total observations in R gtsummary::tbl_summary?

Submitted by 给你一囗甜甜゛ on 2021-01-27 19:10:18

Question


Issue: In gtsummary, the tbl_summary() function calculates column percentages out of the total number of non-missing observations. I would like gtsummary to calculate percentages out of all observations, missing and non-missing combined.

Example from the gtsummary Table Gallery at http://www.danieldsjoberg.com/gtsummary/articles/gallery.html

library(gtsummary)

trial[c("trt", "age", "grade")] %>%
  tbl_summary(
    by = trt, 
    missing = "no",
    statistic = all_continuous() ~ "{median} ({p25}, {p75}) [N = {N_nonmiss}]"
  ) %>%
  modify_header(stat_by = md("**{level}**<br>N =  {n} ({style_percent(p)}%)")) %>%
  add_n() %>%
  bold_labels() %>%
  modify_spanning_header(starts_with("stat_") ~ "**Chemotherapy Treatment**")

Grade has no missing observations, so the 35 people with Grade I disease in the Drug A group are reported as 35/98 (36%).

Now recode Grade III to missing:

trial$grade[trial$grade %in% "III"] <- NA
trial$grade <- droplevels(trial$grade)

Re-run tbl_summary:

trial[c("trt", "age", "grade")] %>%
  tbl_summary(
    by = trt, 
    missing = "no",
    statistic = all_continuous() ~ "{median} ({p25}, {p75}) [N = {N_nonmiss}]"
  ) %>%
  modify_header(stat_by = md("**{level}**<br>N =  {n} ({style_percent(p)}%)")) %>%
  add_n() %>%
  bold_labels() %>%
  modify_spanning_header(starts_with("stat_") ~ "**Chemotherapy Treatment**")

Grade I is now reported as n = 35 out of the 67 non-missing observations in the Drug A group (52%). I would still like the percentage to be computed out of all 98 people, i.e. 36%. Is there a way to do this in gtsummary?
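The two percentages can be checked with a few lines of base R. This is only a sketch: the vector below is not the trial data but is rebuilt from the counts quoted above (35 Grade I, 67 non-missing, 98 in total for Drug A), and the name grade_a is made up for illustration.

```r
# Drug A arm rebuilt from the quoted counts: 35 Grade I, 32 Grade II
# (67 non-missing in total) and 31 missing, 98 patients overall.
grade_a <- factor(c(rep("I", 35), rep("II", 32), rep(NA, 31)))

n_grade1 <- sum(grade_a == "I", na.rm = TRUE)

# Denominator that leaves out missing values (what the table reports)
pct_nonmissing <- 100 * n_grade1 / sum(!is.na(grade_a))

# Denominator that counts every row (what the question asks for)
pct_all <- 100 * n_grade1 / length(grade_a)

round(c(non_missing = pct_nonmissing, all_rows = pct_all))
```

The first value rounds to 52 and the second to 36, matching the two figures in the question.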


Answer 1:


I think the best way to get what you're looking for is to turn the missing values into an explicit factor level using the forcats::fct_explicit_na() function. Once the missing value is a level of the factor, it is included in the denominator for percentage calculations.

library(gtsummary)
library(tidyverse)


trial %>%
  select(response, trt) %>%
  # make missing value explicit for categorical variables, using fct_explicit_na
  mutate(response = factor(response) %>% fct_explicit_na()) %>%
  # summarize data
  tbl_summary(by = trt)

Does this solution work for you?
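The same idea can be applied to the grade variable from the question. The following is a sketch rather than a tested recipe: the label "Unknown" and the object name trial_explicit are illustrative choices, and in forcats 1.0.0 and later fct_explicit_na() is deprecated in favour of fct_na_value_to_level(), so a deprecation warning may appear.

```r
library(gtsummary)
library(dplyr)
library(forcats)

trial_explicit <- trial %>%
  select(trt, age, grade) %>%
  mutate(
    # recode Grade III to missing, as in the question
    grade = replace(grade, grade %in% "III", NA) %>%
      droplevels() %>%
      # turn NA into a real factor level ("Unknown" is an arbitrary label)
      fct_explicit_na(na_level = "Unknown")
  )

# "Unknown" is now an ordinary level, so it should count toward the denominator
trial_explicit %>%
  tbl_summary(by = trt, missing = "no")
```

With the missing values stored as a level, Grade I in the Drug A column should again be computed out of all 98 patients, at the cost of an extra "Unknown" row in the table.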



Source: https://stackoverflow.com/questions/63640473/calculate-percent-from-total-observations-in-r-gtsummarytbl-summary
