I want to create a list for my classroom of every possible group of 4 students. If I have 20 students, how can I create this, by group, in R, where each row is one combination?
This is now officially a part of the production version of RcppAlgos*.
library(RcppAlgos)
a <- comboGroups(10, numGroups = 2, retType = "3Darray")
dim(a)
[1] 126 5 2
a[1,,]
Grp1 Grp2
[1,] 1 6
[2,] 2 7
[3,] 3 8
[4,] 4 9
[5,] 5 10
a[126,,]
Grp1 Grp2
[1,] 1 2
[2,] 7 3
[3,] 8 4
[4,] 9 5
[5,] 10 6
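To make concrete what a "combination group" is, here is a minimal base R sketch that enumerates the same partitions by brute force. The function name partition_groups is hypothetical and is not part of RcppAlgos. The package's internal algorithm is not shown here and is certainly more efficient. The sketch assumes equal-sized groups whose order does not matter.

```r
# Hypothetical helper (not part of RcppAlgos): enumerate every way to split
# the vector `v` into unordered groups of size `k`, using base R only.
partition_groups <- function(v, k) {
  stopifnot(length(v) %% k == 0)
  # Base case: exactly one group left.
  if (length(v) == k) return(list(list(v)))
  first <- v[1]
  rest <- v[-1]
  out <- list()
  # The smallest remaining element always anchors the next group, so the
  # same grouping is never produced twice in a different group order.
  for (idx in combn(length(rest), k - 1, simplify = FALSE)) {
    grp <- c(first, rest[idx])
    remaining <- rest[setdiff(seq_along(rest), idx)]
    for (rest_groups in partition_groups(remaining, k)) {
      out[[length(out) + 1]] <- c(list(grp), rest_groups)
    }
  }
  out
}

res <- partition_groups(1:10, 5)
length(res)   # 126, matching dim(a)[1] above
res[[1]]      # groups 1..5 and 6..10, like a[1,,]
res[[126]]    # groups (1, 7, 8, 9, 10) and (2, 3, 4, 5, 6), like a[126,,]
```

This is only practical for small inputs, since the number of results grows very quickly.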
Or if you prefer matrices:
a1 <- comboGroups(10, 2, retType = "matrix")
head(a1)
Grp1 Grp1 Grp1 Grp1 Grp1 Grp2 Grp2 Grp2 Grp2 Grp2
[1,] 1 2 3 4 5 6 7 8 9 10
[2,] 1 2 3 4 6 5 7 8 9 10
[3,] 1 2 3 4 7 5 6 8 9 10
[4,] 1 2 3 4 8 5 6 7 9 10
[5,] 1 2 3 4 9 5 6 7 8 10
[6,] 1 2 3 4 10 5 6 7 8 9
It is also really fast. You can even generate in parallel with nThreads or Parallel = TRUE (the latter uses one minus the system max threads) for greater efficiency gains:
comboGroupsCount(16, 4)
[1] 2627625
system.time(comboGroups(16, 4, "matrix"))
user system elapsed
0.107 0.030 0.137
system.time(comboGroups(16, 4, "matrix", nThreads = 4))
user system elapsed
0.124 0.067 0.055
## 7 threads on my machine
system.time(comboGroups(16, 4, "matrix", Parallel = TRUE))
user system elapsed
0.142 0.126 0.047
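The count reported by comboGroupsCount(16, 4) can be checked by hand. Assuming equal-sized groups whose order does not matter, the number of results is n! / ((n/g)!^g * g!). Below is a small base R sketch of that formula. The name count_groups is hypothetical and not part of RcppAlgos. It uses double-precision arithmetic, so it is only suitable for small inputs. The 50-state example needs big integers.

```r
# Hypothetical helper (not part of RcppAlgos): number of ways to split
# n items into g unlabeled groups of equal size.
count_groups <- function(n, g) {
  stopifnot(n %% g == 0)
  k <- n / g
  # Choose each group in turn from what is left, then divide by g!
  # because the order of the groups is irrelevant.
  picks <- choose(n - k * (0:(g - 1)), k)
  round(prod(picks) / factorial(g))
}

count_groups(10, 2)   # 126
count_groups(16, 4)   # 2627625
count_groups(20, 5)   # 2546168625
```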
A really nice feature is the ability to generate samples or specific lexicographical combination groups, especially when the number of results is large.
comboGroupsCount(factor(state.abb), numGroups = 10)
Big Integer ('bigz') :
[1] 13536281554808237495608549953475109376
mySamp <- comboGroupsSample(factor(state.abb),
numGroups = 10, "3Darray", n = 5, seed = 42)
mySamp[1,,]
Grp1 Grp2 Grp3 Grp4 Grp5 Grp6 Grp7 Grp8 Grp9 Grp10
[1,] AL AK AR CA CO CT DE FL LA MD
[2,] IA AZ ME ID GA OR IL IN MS NM
[3,] KY ND MO MI HI PA MN KS MT OH
[4,] TX RI SC NH NV WI NE MA NY TN
[5,] VA VT UT OK NJ WY WA NC SD WV
50 Levels: AK AL AR AZ CA CO CT DE FL GA HI IA ID IL IN KS KY LA MA MD ME MI MN MO MS MT NC ND NE NH NJ NM NV NY OH ... WY
firstAndLast <- comboGroupsSample(state.abb, 10, "3Darray",
sampleVec = c("1",
"13536281554808237495608549953475109376"))
firstAndLast[1,,]
Grp1 Grp2 Grp3 Grp4 Grp5 Grp6 Grp7 Grp8 Grp9 Grp10
[1,] "AL" "CO" "HI" "KS" "MA" "MT" "NM" "OK" "SD" "VA"
[2,] "AK" "CT" "ID" "KY" "MI" "NE" "NY" "OR" "TN" "WA"
[3,] "AZ" "DE" "IL" "LA" "MN" "NV" "NC" "PA" "TX" "WV"
[4,] "AR" "FL" "IN" "ME" "MS" "NH" "ND" "RI" "UT" "WI"
[5,] "CA" "GA" "IA" "MD" "MO" "NJ" "OH" "SC" "VT" "WY"
firstAndLast[2,,]
Grp1 Grp2 Grp3 Grp4 Grp5 Grp6 Grp7 Grp8 Grp9 Grp10
[1,] "AL" "AK" "AZ" "AR" "CA" "CO" "CT" "DE" "FL" "GA"
[2,] "WA" "TX" "RI" "OH" "NM" "NE" "MN" "ME" "IA" "HI"
[3,] "WV" "UT" "SC" "OK" "NY" "NV" "MS" "MD" "KS" "ID"
[4,] "WI" "VT" "SD" "OR" "NC" "NH" "MO" "MA" "KY" "IL"
[5,] "WY" "VA" "TN" "PA" "ND" "NJ" "MT" "MI" "LA" "IN"
And finally, generating all 2,546,168,625 combination groups of 20 people into 5 groups (what the OP asked for) can be achieved in under a minute using the lower and upper arguments:
system.time(aPar <- parallel::mclapply(seq(1, 2546168625, 969969), function(x) {
combs <- comboGroups(20, 5, "3Darray", lower = x, upper = x + 969968)
### do something
dim(combs)
}, mc.cores = 6))
user system elapsed
217.667 22.932 48.482
sum(sapply(aPar, "[", 1))
[1] 2546168625
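The chunking above works cleanly because 2546168625 is an exact multiple of 969969, so every chunk has the same size and the last upper bound lands exactly on the total. Here is a base R sketch of how the boundaries can be computed and checked. The variable names are illustrative, and the pmin() clamp is a defensive assumption for totals that do not divide evenly. It is not needed here.

```r
total <- 2546168625   # number of combination groups quoted above
chunk <- 969969       # chunk size used in the mclapply call

starts <- seq(1, total, by = chunk)
# Clamp the final upper bound in case `total` is not a multiple of `chunk`.
ends <- pmin(starts + chunk - 1, total)

length(starts)                    # 2625 chunks
total %% chunk == 0               # TRUE: all chunks are full
sum(ends - starts + 1) == total   # TRUE: every index is covered exactly once
```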
Although I started working on this problem over a year ago, this question was a huge inspiration for getting this formalized in a package.
* I am the author of RcppAlgos