Summarize with dplyr “other than” groups

感情败类 2021-01-20 14:38

I need to summarize in a grouped data_frame (note: a solution with dplyr is very much appreciated but isn't mandatory) both something on each group (simple) and the same so

2 answers
  • 2021-01-20 14:43

    I don't think it is possible, in general, to operate on other groups within summarise(); as far as I can tell, the other groups are not "visible" while a given group is being summarised. You can, however, define your own functions and apply them to a variable in mutate(). For your updated example you can use:

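    # Median of every element of x except the i-th one
    calc_med_other <- function(x) sapply(seq_along(x), function(i) median(x[-i]))
    # Median of the elements of x before the i-th one (NA for the first element)
    calc_med_before <- function(x) sapply(seq_along(x), function(i) if (i == 1) NA else median(x[seq_len(i - 1)]))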
    
    df %>%
        group_by(group) %>%
        summarize(med = median(value)) %>%
        mutate(
            med_other = calc_med_other(med),
            med_before = calc_med_before(med)
        )
    #   group   med med_other med_before
    #   (chr) (dbl)     (dbl)      (dbl)
    #1     a   1.5       4.5         NA
    #2     b   3.5       3.5        1.5
    #3     c   5.5       2.5        2.5
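
To try the two helpers without the original data, here is a minimal base R sketch. The vector `med` is an assumption, filled with the three group medians shown in the output above, and the helpers are written with `vapply()`, `seq_len()` and a plain `if` purely as a stylistic choice. Note that both helpers work on the per-group medians, so `med_other` here is a median of medians rather than a median of the underlying observations.

```r
# Assumed input: the per-group medians from the output above (groups a, b, c).
med <- c(1.5, 3.5, 5.5)

# Median of every element except the i-th one (leave-one-out).
calc_med_other <- function(x) {
  vapply(seq_along(x), function(i) median(x[-i]), numeric(1))
}

# Median of the elements before the i-th one; NA for the first element.
calc_med_before <- function(x) {
  vapply(seq_along(x), function(i) {
    if (i == 1) NA_real_ else median(x[seq_len(i - 1)])
  }, numeric(1))
}

med_other  <- calc_med_other(med)   # 4.5 3.5 2.5
med_before <- calc_med_before(med)  # NA 1.5 2.5
```

With groups of unequal size, a median of medians can differ from the median of all observations outside the group, so it is worth checking which of the two is actually wanted.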
    
  • 2021-01-20 15:03

    Here's my solution:

    res <- df %>%
      group_by(group) %>%
      summarise(med_group = median(value),
                med_other = (median(df$value[df$group != group]))) %>% 
      mutate(med_before = lag(med_group))
    
    > res
    Source: local data frame [3 x 4]
    
          group med_group med_other med_before
      (chr)     (dbl)     (dbl)      (dbl)
    1     a       1.5       4.5         NA
    2     b       3.5       3.5        1.5
    3     c       5.5       2.5        3.5
    

    I was trying to come up with an all-dplyr solution, but base R subsetting works just fine: median(df$value[df$group != group]) returns the median of all observations that are not in the current group.
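
The subsetting idea can also be tried on its own, outside of summarise(). The sketch below uses only base R; the data frame `df` is an assumption (three groups of two observations, chosen so that the group medians match the output above), since the original data is not shown here.

```r
# Assumed example data: three groups with two observations each.
df <- data.frame(
  group = rep(c("a", "b", "c"), each = 2),
  value = 1:6,
  stringsAsFactors = FALSE
)

groups <- unique(df$group)

# For each group g, take the median of all rows that do NOT belong to g.
med_other <- vapply(
  groups,
  function(g) median(df$value[df$group != g]),
  numeric(1)
)

med_other
#   a   b   c
# 4.5 3.5 2.5
```

Because the comparison `df$group != g` looks at the whole data frame, this works on the observations themselves rather than on per-group summaries.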

    I hope this helps you solve your problem.
