I\'ve been trying to figure out a way to generate all distinct size-n partitions of a multiset, but so far have come up empty handed. First let me show what I\'m trying to archi
Here's a working solution that makes use of the next_combination
function presented by Hervé Brönnimann in N2639. The comments should make it pretty self-explanatory. The "herve/combinatorics.hpp" file contains the code listed in N2639 inside the herve
namespace. It's in C++11/14, converting to an older standard should be pretty trivial.
Note that I only quickly tested the solution. Also, I extracted it from a class-based implementation just a couple of minutes ago, so some extra bugs might have crept in. A quick initial test seems to confirm it works, but there might be corner cases for which it won't.
#include <cstdint>
#include <iterator>
#include "herve/combinatorics.hpp"
template <typename BidirIter>
bool next_combination_partition (BidirIter const & startIt,
BidirIter const & endIt, uint32_t const groupSize) {
// Typedefs
using tDiff = typename std::iterator_traits<BidirIter>::difference_type;
// Skip the last partition, because is consists of the remaining elements.
// Thus if there's 2 groups or less, the start should be at position 0.
tDiff const totalLength = std::distance(startIt, endIt);
uint32_t const numTotalGroups = std::max(static_cast<uint32_t>((totalLength - 1) / groupSize + 1), 2u);
uint32_t curBegin = (numTotalGroups - 2) * groupSize;
uint32_t const lastGroupBegin = curBegin - 1;
uint32_t curMid = curBegin + groupSize;
bool atStart = (totalLength != 0);
// Iterate over combinations from back of list to front. If a combination ends
// up at its starting value, update the previous one as well.
for (; (curMid != 0) && (atStart);
curMid = curBegin, curBegin -= groupSize) {
// To prevent duplicates, first element of each combination partition needs
// to be fixed. So move start iterator to the next element. This is not true
// for the starting (2nd to last) group though.
uint32_t const startIndex = std::min(curBegin + 1, lastGroupBegin + 1);
auto const iterStart = std::next(startIt, startIndex);
auto const iterMid = std::next(startIt, curMid);
atStart = !herve::next_combination(iterStart, iterMid, endIt);
}
return !atStart;
}
Edit Below is my quickly thrown together test code ("combopart.hpp" obviously being the file containing the above function).
#include "combopart.hpp"
#include <algorithm>
#include <cstdint>
#include <iostream>
#include <iterator>
#include <vector>
int main (int argc, char* argv[]) {
uint32_t const groupSize = 2;
std::vector<uint32_t> v;
v = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9};
v = {0, 0, 0, 1, 1, 1, 2, 2, 2, 3};
v = {1, 1, 2, 2};
// Make sure contents are sorted
std::sort(v.begin(), v.end());
uint64_t count = 0;
do {
++count;
std::cout << "[ ";
uint32_t elemCount = 0;
for (auto it = v.begin(); it != v.end(); ++it) {
std::cout << *it << " ";
elemCount++;
if ((elemCount % groupSize == 0) && (it != std::prev(v.end()))) {
std::cout << "| ";
}
}
std::cout << "]" << std::endl;
} while (next_combination_partition(v.begin(), v.end(), groupSize));
std::cout << std::endl << "# elements: " << v.size() << " - group size: " <<
groupSize << " - # combination partitions: " << count << std::endl;
return 0;
}
Edit 2 Improved algorithm. Replaced early exit branch with combination of conditional move (using std::max
) and setting atStart
boolean to false. Untested though, be warned.
Edit 3 Needed an extra modification so as not to "fix" the first element in the 2nd to last partition. The additional code should compile as a conditional move, so there should be no branching cost associated with it.
P.S.: I am aware that the code to generate combinations by @Howard Hinnant (available at https://howardhinnant.github.io/combinations.html) is much faster than the one by Hervé Brönnimann. However, that code can not handle duplicates in the input (because as far as I can see, it never even dereferences an iterator), which my problem explicitly requires. On the other hand, if you know for sure your input won't contain duplicates, it is definitely the code you want use with my function above.
A recursive algorithm to distribute the elements one-by-one could be based on a few simple rules:
{A,B,D,C,C,D,B,A,C} -> {A,A,B,B,D,D,C,C,C}
{ , , } { , , } { , , }
{A, , } { , , } { , , }
^dup^
{A, , } {A, , } {A, , }
^dup^ ^dup^
partial solution: {A, , } {A, , } { , , }
^dup^
insert element B: {A,B, } {A, , } { , , }
{A, , } {A, , } {B, , }
partial solution: {A, , } {B, , } { , , }
insert another B: {A,B, } {B, , } { , , } <- ILLEGAL
{A, , } {B,B, } { , , } <- OK
{A, , } {B, , } {B, , } <- OK
partial solution: {A, , } {A, , } {B,B, }
insert first D: {A,D, } {A, , } {B,B, } <- OK
{A, , } {A, , } {B,B,D} <- ILLEGAL (NO SPACE FOR 2ND D)
partial solution: {A,A, } {B,B,D} {D, , }
insert C,C,C: {A,A,C} {B,B,D} {D,C,C}
So the algorithm would be something like this:
// PREPARATION
Sort or group input. // {A,B,D,C,C,D,B,A,C} -> {A,A,B,B,D,D,C,C,C}
Create empty partial solution. // { , , } { , , } { , , }
Start recursion with empty partial solution and index at start of input.
// RECURSION
Receive partial solution, index, group size and last-used block.
If group size is zero:
Find group size of identical elements in input, starting at index.
Set last-used block to first block.
Find empty places in partial solution, starting at last-used block.
If index is at last group in input:
Fill empty spaces with elements of last group.
Store complete solution.
Return from recursion.
Mark duplicate blocks in partial solution.
For each block in partial solution, starting at last-used block:
If current block is not a duplicate, and has empty places,
and the places left in current and later blocks is not less than the group size:
Insert element into copy of partial solution.
Recurse with copy, index + 1, group size - 1, current block.
I tested a simple JavaScript implementation of this algorithm, and it gives the correct output.
Here's my pencil and paper algorithm:
Describe the multiset in item quantities, e.g., {(1,2),(2,2)}
f(multiset,result):
if the multiset is empty:
return result
otherwise:
call f again with each unique distribution of one element added to result and
removed from the multiset state
Example:
{(1,2),(2,2),(3,2)} n = 2
11 -> 11 22 -> 11 22 33
11 2 2 -> 11 23 23
1 1 -> 12 12 -> 12 12 33
12 1 2 -> 12 13 23
Example:
{(1,2),(2,2),(3,2)} n = 3
11 -> 112 2 -> 112 233
11 22 -> 113 223
1 1 -> 122 1 -> 122 133
12 12 -> 123 123
Let's solve the problem commented below by m69 of dealing with potential duplicate distribution:
{A,B,B,C,C,D,D,D,D}
We've reached {A, , }{B, , }{B, , }, have 2 C's to distribute
and we'd like to avoid `ac bc b` generated along with `ac b bc`.
Because our generation in the level just above is ordered, the series of identical
counts will be continuous. When a series of identical counts is encountered, make
the assignment for the whole block of identical counts (rather than each one),
and partition that contribution in descending parts; for example,
| identical |
ac b b
ac bc b // descending parts [1,0]
Example of longer block:
| identical block | descending parts
ac bcccc b b b // [4,0,0,0]
ac bccc bc b b // [3,1,0,0]
ac bcc bcc b b // [2,2,0,0]
...