Create Combinations in R by Groups

后端 未结 7 1054
一整个雨季
一整个雨季 2021-02-08 19:23

I want to create a list for my classroom of every possible group of 4 students. If I have 20 students, how I can I create this, by group, in R where my rows are each combination

7条回答
  •  伪装坚强ぢ
    2021-02-08 20:00

    This relies heavily on this answer:

    Algorithm that can create all combinations and all groups of those combinations

    One thing to note is that the answer is not that dynamic - it only included a solution for groups of 3. To make it more robust, we can create the code based on the input parameters. That is, the following recursive function is created on the fly for groups 3:

    group <- function(input, step){
     len <- length(input) 
     combination[1, step] <<- input[1] 
    
     for (i1 in 2:(len-1)) { 
       combination[2, step] <<- input[i1] 
    
       for (i2 in (i1+1):(len-0)) { 
         combination[3, step] <<- input[i2] 
    
         if (step == m) { 
           print(z); result[z, ,] <<- combination 
           z <<- z+1 
         } else { 
           rest <- setdiff(input, input[c(i1,i2, 1)]) 
           group(rest, step +1) #recursive if there are still additional possibilities
       }} 
     } 
    }
    

    This takes around 55 seconds to run for N = 16 and k = 4. I'd like to translate it into Rcpp but unfortunately I do not have that skillset.

    group_N <- function(input, k = 2) {
      N = length(input)
      m = N/k
      combos <- factorial(N) / (factorial(k)^m * factorial(m))
    
      result <- array(NA_integer_, dim = c(combos, m, k))
      combination = matrix(NA_integer_, nrow = k, ncol = m)
    
      z = 1
    
      group_f_start = 'group <- function(input, step){\n len <- length(input) \n combination[1,  step] <<- input[1] \n '
      i_s <- paste0('i', seq_len(k-1))
    
      group_f_fors = paste0('for (', i_s, ' in ', c('2', if (length(i_s) != 1) {paste0('(', i_s[-length(i_s)], '+1)')}), ':(len-', rev(seq_len(k)[-k])-1, ')) { \n combination[', seq_len(k)[-1], ', step] <<- input[', i_s, '] \n', collapse = '\n ')
    
      group_f_inner = paste0('if (step == m) { \n result[z, ,] <<- combination \n z <<- z+1 \n } else { \n rest <- setdiff(input, input[c(',
                             paste0(i_s, collapse = ','),
                             ', 1)]) \n group(rest, step +1) \n }')
    
      eval(parse(text = paste0(group_f_start, group_f_fors, group_f_inner, paste0(rep('}', times = k), collapse = ' \n '))))
    
      group(input, 1)
      return(result)
    }
    

    Performance

    system.time({test_1 <- group_N(seq_len(4), 2)})
    #   user  system elapsed 
    #   0.01    0.00    0.02
    library(data.table)
    
    #this funky step is just to better show the groups. the provided
    ## array is fine.
    
    as.data.table(t(rbindlist(as.data.table(apply(test_1, c(1,3), list)))))
    #    V1  V2
    #1: 1,2 3,4
    #2: 1,3 2,4
    #3: 1,4 2,3
    
    system.time({test_1 <- group_N(seq_len(16), 4)})
    #   user  system elapsed 
    #  55.00    0.19   55.29 
    
    as.data.table(t(rbindlist(as.data.table(apply(test_1, c(1,3), list)))))
    #very slow
    #                  V1          V2          V3          V4
    #      1:     1,2,3,4     5,6,7,8  9,10,11,12 13,14,15,16
    #      2:     1,2,3,4     5,6,7,8  9,10,11,13 12,14,15,16
    #      3:     1,2,3,4     5,6,7,8  9,10,11,14 12,13,15,16
    #      4:     1,2,3,4     5,6,7,8  9,10,11,15 12,13,14,16
    #      5:     1,2,3,4     5,6,7,8  9,10,11,16 12,13,14,15
    #     ---                                                
    #2627621:  1,14,15,16  2,11,12,13  3, 6, 9,10     4,5,7,8
    #2627622:  1,14,15,16  2,11,12,13     3,7,8,9  4, 5, 6,10
    #2627623:  1,14,15,16  2,11,12,13  3, 7, 8,10     4,5,6,9
    #2627624:  1,14,15,16  2,11,12,13  3, 7, 9,10     4,5,6,8
    #2627625:  1,14,15,16  2,11,12,13  3, 8, 9,10     4,5,6,7
    

提交回复
热议问题