Grouping Over All Possible Combinations of Several Variables With dplyr

遇见更好的自我 2021-01-02 15:49

Given a situation such as the following:

library(dplyr)
myData <- tbl_df(data.frame( var1 = rnorm(100), 
                             var2 = letters[1:3] %         


        
4 Answers
  •  礼貌的吻别
    2021-01-02 16:30

    Using unite() to build a single key column from the grouping variables is the simplest way:

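    library(tidyverse)

    # Example data: every combination of a and b occurs twice
    df = tibble(
      a = c(1, 1, 2, 2, 1, 1, 2, 2),
      b = c(3, 4, 3, 4, 3, 4, 3, 4),
      val = c(1, 2, 3, 4, 5, 6, 7, 8)
    )
    print(df)  # output1

    # Paste a and b into one key column; remove = FALSE keeps the original columns
    df_2 = unite(df, "combined_header", a, b, sep = "_", remove = FALSE)
    print(df_2)  # output2

    # Group by the combined key and average val within each group
    df_2 %>%
      group_by(combined_header) %>%
      summarize(avg_val = mean(val)) %>%
      print()  # output3
    # avg for 1_3 = mean(c(1, 5)) = 3; avg for 1_4 = mean(c(2, 6)) = 4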
    

    Results:
    output1
          a     b   val
      <dbl> <dbl> <dbl>
    1     1     3     1
    2     1     4     2
    3     2     3     3
    4     2     4     4
    5     1     3     5
    6     1     4     6
    7     2     3     7
    8     2     4     8
    
    output2
      combined_header     a     b   val
      <chr>           <dbl> <dbl> <dbl>
    1 1_3                 1     3     1
    2 1_4                 1     4     2
    3 2_3                 2     3     3
    4 2_4                 2     4     4
    5 1_3                 1     3     5
    6 1_4                 1     4     6
    7 2_3                 2     3     7
    8 2_4                 2     4     8
    
    output3
      combined_header avg_val
      <chr>             <dbl>
    1 1_3                   3
    2 1_4                   4
    3 2_3                   5
    4 2_4                   6
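
For comparison, here is a minimal sketch of the same summary without a combined key column, since group_by() accepts several columns at once. The second part is an assumption about the question, which is truncated above: going by the title, the goal may be to also list combinations that never occur in the data. The filter step is a hypothetical way to create such a gap, and tidyr's complete() then restores the missing pair with NA. The names by_pair and all_pairs are made up for illustration.

```r
library(dplyr)
library(tidyr)

df <- tibble(
  a   = c(1, 1, 2, 2, 1, 1, 2, 2),
  b   = c(3, 4, 3, 4, 3, 4, 3, 4),
  val = c(1, 2, 3, 4, 5, 6, 7, 8)
)

# Group by both columns directly; no united key column is needed.
# .groups = "drop" returns an ungrouped tibble (dplyr >= 1.0.0).
by_pair <- df %>%
  group_by(a, b) %>%
  summarize(avg_val = mean(val), .groups = "drop")
print(by_pair)  # avg_val is 3, 4, 5, 6 for (1,3), (1,4), (2,3), (2,4)

# Hypothetical gap: remove every row with a == 2 and b == 4, then use
# complete() to add that combination back with avg_val = NA.
all_pairs <- df %>%
  filter(!(a == 2 & b == 4)) %>%
  group_by(a, b) %>%
  summarize(avg_val = mean(val), .groups = "drop") %>%
  complete(a, b)
print(all_pairs)  # four rows; the (2, 4) row has avg_val = NA
```

Keeping a and b as separate columns means their original types survive, whereas unite() produces a character key. Note that complete() only expands over values that appear somewhere in the data, so a level that is absent from every row would still need to be supplied explicitly.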
    
