I want to create a list for my classroom of every possible group of 4 students. If I have 20 students, how can I create this, by group, in R, where each row is one combination?
This is a challenging problem computationally, since I believe there are about 2.5 billion possibilities to enumerate. (If I'm mistaken, I'd welcome any insight about where this approach goes wrong.)
Depending on how it's stored, a table with all those groupings might require more RAM than most computers can handle. I'd be impressed to see an efficient way to create that. Even with a "create one combination at a time" approach, generating all the possibilities would take about 42 minutes at 1,000,000 per second, or about a month at only 1,000 per second.
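As a rough sanity check on those estimates, here is a back-of-the-envelope sketch. It uses the count of 2,546,168,625 derived further down, and it assumes (my assumption, not a measurement) that each grouping is stored as one row of 20 four-byte integer student IDs.

```r
n_groupings <- 2546168625            # count derived later in this answer

# Assumed storage layout: 20 integer IDs per grouping, 4 bytes each
gigabytes <- n_groupings * 20 * 4 / 1e9
gigabytes                            # about 204

# Time to enumerate everything at two hypothetical generation rates
minutes_fast <- n_groupings / 1e6 / 60      # 1,000,000 per second
days_slow    <- n_groupings / 1e3 / 86400   # 1,000 per second
minutes_fast                         # about 42.4
days_slow                            # about 29.5
```

A more compact encoding would shrink the storage figure, but probably not enough to fit in ordinary RAM.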
EDIT - added partial implementation at the bottom to create any desired grouping from #1 to #2,546,168,625. For some purposes, this may be almost as good as actually storing the whole sequence, which is very large.
Let's say we are going to make 5 groups of four students each: Group A, B, C, D, and E.
Let's define Group A as the group Student #1 is in. They can be paired with any three of the other 19 students. I believe there are 969 such combinations of other students:
> nrow(t(combn(1:19, 3)))
[1] 969
There are now 16 students left for the other groups. Let's assign the first student not already in Group A to Group B. That might be student 2, 3, 4, or 5. It doesn't matter; all we need to know is that there are only 15 students that can be paired with that student. There are 455 such combinations:
> nrow(t(combn(1:15, 3)))
[1] 455
Now there are 12 students left. Again, let's assign the first ungrouped student to Group C, and we have 165 combinations left for them with the other 11 students:
> nrow(t(combn(1:11, 3)))
[1] 165
And we have 8 students left, 7 of whom can be paired with the first ungrouped student into Group D in 35 ways:
> nrow(t(combn(1:7, 3)))
[1] 35
And then, once our other groups are determined, there's only one group of four students left, three of whom can be paired with the first ungrouped student:
> nrow(t(combn(1:3, 3)))
[1] 1
Multiplying these counts together implies about 2.546 billion distinct groupings:
> 969*455*165*35*1
[1] 2546168625
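The same counts can be reproduced without building any combination tables, since `choose(n, k)` returns the number of combinations directly. The sketch below also cross-checks the total another way: choose five groups of four in order, then divide by the 5! orderings of the groups. The variable names are mine.

```r
# Students available to join the first ungrouped student at each step
others <- c(19, 15, 11, 7, 3)
per_group <- choose(others, 3)
per_group                  # 969 455 165 35 1

total <- prod(per_group)
total                      # 2546168625

# Cross-check: ordered selection of five groups of four, then ignore
# the order in which the groups were picked
alt <- prod(choose(c(20, 16, 12, 8, 4), 4)) / factorial(5)
isTRUE(all.equal(alt, total))   # TRUE
```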
Here's a work-in-progress function that produces a grouping based on any arbitrary sequence number.
1) [in progress] Convert the sequence number to a vector describing which combination number should be used for Group A, B, C, D, and E. For instance, this should convert #1 to c(1, 1, 1, 1, 1) and #2,546,168,625 to c(969, 455, 165, 35, 1).
2) Convert the combinations to a specific output describing the students in each Group.
groupings <- function(seq_nums) {
  students <- 20
  group_size <- 4
  grouped <- NULL
  remaining <- seq_len(students)
  # The last group always uses the only possible combination
  seq_nums_pad <- c(seq_nums, 1)
  for (g in seq_len(students / group_size)) {
    # All ways to choose the other members for the first ungrouped student
    combos <- t(combn(1:(length(remaining) - 1), group_size - 1))
    # Position 1 is the first ungrouped student; shift the others by one
    group_relative <- c(1, 1 + combos[seq_nums_pad[g], ])
    group <- remaining[group_relative]
    print(group)
    grouped <- c(grouped, group)
    remaining <- setdiff(remaining, grouped)
  }
}
> groupings(c(1,1,1,1))
#[1] 1 2 3 4
#[1] 5 6 7 8
#[1] 9 10 11 12
#[1] 13 14 15 16
#[1] 17 18 19 20
> groupings(c(1,1,1,2))
#[1] 1 2 3 4
#[1] 5 6 7 8
#[1] 9 10 11 12
#[1] 13 14 15 17
#[1] 16 18 19 20
> groupings(c(969, 455, 165, 35)) # This one uses the last possibility for
#[1] 1 18 19 20 # each grouping.
#[1] 2 15 16 17
#[1] 3 12 13 14
#[1] 4 9 10 11
#[1] 5 6 7 8
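One possible way to fill in step 1, offered as a sketch rather than a finished implementation, is to treat the sequence number as a mixed-radix number whose digits have bases 969, 455, 165, 35, and 1. The helper name `seq_to_index` is hypothetical, and the ordering is an assumption on my part: I let the later groups vary fastest, so that #2 maps to c(1, 1, 1, 2, 1). It only reproduces the two endpoints stated in step 1; any other consistent ordering would work equally well.

```r
# Hypothetical helper (not from the function above): decode a 1-based
# sequence number into one combination index per group.
seq_to_index <- function(seq_num, students = 20, group_size = 4) {
  n_groups <- students / group_size
  # Choices available to each group: choose(19, 3), choose(15, 3), ...
  radices <- choose(students - group_size * (seq_len(n_groups) - 1) - 1,
                    group_size - 1)
  stopifnot(seq_num >= 1, seq_num <= prod(radices))
  # Work zero-based and keep everything as doubles, because the largest
  # sequence number exceeds .Machine$integer.max
  k <- seq_num - 1
  idx <- numeric(n_groups)
  for (g in n_groups:1) {          # assumption: last group varies fastest
    idx[g] <- k %% radices[g] + 1
    k <- k %/% radices[g]
  }
  idx
}

seq_to_index(1)            # 1 1 1 1 1
seq_to_index(2)            # 1 1 1 2 1
seq_to_index(2546168625)   # 969 455 165 35 1
```

The first four elements of the result are what `groupings()` expects as `seq_nums`; the fifth is always 1.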