Say that I have an array like the following:
Array
(
[arm] => Array
(
[0] => A
[1] => B
[2] => C
Here's what I could come up with:
function inject($elem, $array) {
return array_map(function ($n) use ($elem) { return array_merge((array)$elem, (array)$n); }, $array);
}
function zip($array1, $array2) {
return array_reduce($array1, function ($v, $n) use ($array2) { return array_merge($v, inject($n, $array2)); }, array());
}
function cartesian_product($array) {
$keys = array_keys($array);
$prod = array_shift($array);
$prod = array_reduce($array, 'zip', $prod);
return array_map(function ($n) use ($keys) { return array_combine($keys, $n); }, $prod);
}
(Using pseudo array/list/dictionary notation below since PHP is simply too verbose for such things.)
The inject function transforms a, [b] into [(a,b)], i.e. it injects a single value into each value of an array, returning an array of arrays. It doesn't matter whether a or b already is an array; it'll always return a two-dimensional array.
inject('a', ['foo', 'bar'])
=> [('a', 'foo'), ('a', 'bar')]
The zip function applies the inject function to each element in an array.
zip(['a', 'b'], ['foo', 'bar'])
=> [('a', 'foo'), ('a', 'bar'), ('b', 'foo'), ('b', 'bar')]
Note that this actually produces a cartesian product, so zip is a slight misnomer. Simply applying this function to all elements in a data set in succession gives you the cartesian product for an array of any length.
zip(zip(['a', 'b'], ['foo', 'bar']), ['42', '76'])
=> [('a', 'foo', '42'), ('a', 'foo', '76'), ('a', 'bar', '42'), …]
This does not contain the keys, but since the elements are all in order within the result set, you can simply re-inject the keys into the result.
array_combine(['key1', 'key2', 'key3'], ['a', 'foo', '42'])
=> [ key1 : 'a', key2 : 'foo', key3 : '42' ]
Applying this to all elements in the product gives the desired result.
You can collapse the above three functions into a single long statement if you wish (which would also clear up the misnomers).
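As a rough illustration of that collapsing, here is a minimal sketch that folds the three steps into one function. The name cartesian_product_compact is hypothetical and not part of the answer above. It starts from a single empty combination rather than from the first sub-array, so for an empty input it returns one empty combination, which may differ from the original functions.

```php
<?php
// Hypothetical single-function variant of inject/zip/cartesian_product.
function cartesian_product_compact(array $input)
{
    $keys = array_keys($input);
    // Start with one empty combination and extend it one sub-array at a time.
    $prod = array(array());
    foreach ($input as $values) {
        $next = array();
        foreach ($prod as $combo) {
            foreach ($values as $value) {
                $extended = $combo;      // arrays are copied on assignment
                $extended[] = $value;    // append the value for the current sub-array
                $next[] = $extended;
            }
        }
        $prod = $next;
    }
    // Re-attach the original keys to every combination.
    return array_map(function ($combo) use ($keys) {
        return array_combine($keys, $combo);
    }, $prod);
}

$demo = cartesian_product_compact(array(
    'arm' => array('A', 'B'),
    'gender' => array('Female', 'Male'),
));
```

Unlike inject, this sketch does not cast values with (array), so a value that is itself an array is kept as a single element instead of being flattened.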
An "unrolled" version without anonymous functions for PHP <= 5.2 would look like this:
function inject($elem, $array) {
$elem = (array)$elem;
foreach ($array as &$a) {
$a = array_merge($elem, (array)$a);
}
return $array;
}
function zip($array1, $array2) {
$prod = array();
foreach ($array1 as $a) {
$prod = array_merge($prod, inject($a, $array2));
}
return $prod;
}
function cartesian_product($array) {
$keys = array_keys($array);
$prod = array_shift($array);
$prod = array_reduce($array, 'zip', $prod);
foreach ($prod as &$a) {
$a = array_combine($keys, $a);
}
return $prod;
}
Here's a solution I wouldn't be ashamed to show.
Assume that we have an input array $input with N sub-arrays, as in your example. Each sub-array has Cn items, where n is its index inside $input, and its key is Kn. I will refer to the ith item of the nth sub-array as Vn,i.
The algorithm below can be proved to work (barring bugs) by induction:
1) For N = 1, the cartesian product is simply array(0 => array(K1 => V1,1), 1 => array(K1 => V1,2), ... ) -- C1 items in total. This can be done with a simple foreach.
2) Assume that $result already holds the cartesian product of the first N-1 sub-arrays. The cartesian product of $result and the Nth sub-array can be produced this way:
3) In each item (array) inside $result, add the value KN => VN,1. Remember the resulting item (with the added value); I'll refer to it as $item.
4a) For each array inside $result:
4b) For each value in the set VN,2 ... VN,CN, add to $result a copy of $item, but change the value with the key KN to VN,m (for all 2 <= m <= CN).
The two iterations 4a (over $result) and 4b (over the Nth input sub-array) end up with $result having CN items for every item it had before the iterations, so in the end $result indeed contains the cartesian product of the first N sub-arrays.
Therefore the algorithm will work for any N.
This was harder to write than it should have been. My formal proofs are definitely getting rusty...
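To make the inductive step concrete, here is a minimal sketch of steps 2 to 4 as a standalone helper. The name extend_product is hypothetical and not part of this answer. It builds a fresh array instead of appending to $result in place, so the combinations come out in a different order than in the function that follows, but the size argument is the same: every existing item yields CN new items.

```php
<?php
// Hypothetical helper: one inductive step.
// $result holds the product of the first N-1 sub-arrays;
// $key and $values are the key and items of the Nth sub-array.
function extend_product(array $result, $key, array $values)
{
    $extended = array();
    foreach ($result as $item) {
        foreach ($values as $value) {
            $item[$key] = $value;   // set (or overwrite) the entry for the Nth key
            $extended[] = $item;    // stored as a copy, so later overwrites do not affect it
        }
    }
    return $extended;
}

$base = array(array('arm' => 'A'), array('arm' => 'B'));
$step = extend_product($base, 'gender', array('Female', 'Male'));
// count($step) equals count($base) * 2
```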
function cartesian($input) {
$result = array();
// each() was removed in PHP 8; foreach visits the same key/value pairs
foreach ($input as $key => $values) {
// If a sub-array is empty, it doesn't affect the cartesian product
if (empty($values)) {
continue;
}
// Seeding the product array with the values from the first sub-array
if (empty($result)) {
foreach($values as $value) {
$result[] = array($key => $value);
}
}
else {
// Second and subsequent input sub-arrays work like this:
// 1. In each existing array inside $result, add an item with
// key == $key and value == first item in input sub-array
// 2. Then, for each remaining item in current input sub-array,
// add a copy of each existing array inside $result with
// key == $key and value == that remaining item
// Store all items to be added to $result here; adding them
// inside the foreach will result in an infinite loop
$append = array();
foreach($result as &$product) {
// Do step 1 above. array_shift is not the most efficient, but
// it allows us to iterate over the rest of the items with a
// simple foreach, making the code short and easy to read.
$product[$key] = array_shift($values);
// $product is by reference (that's why the key we added above
// will appear in the end result), so make a copy of it here
$copy = $product;
// Do step 2 above.
foreach($values as $item) {
$copy[$key] = $item;
$append[] = $copy;
}
// Undo the side effects of array_shift
array_unshift($values, $product[$key]);
}
// Out of the foreach, we can add to $result now
$result = array_merge($result, $append);
}
}
return $result;
}
$input = array(
'arm' => array('A', 'B', 'C'),
'gender' => array('Female', 'Male'),
'location' => array('Vancouver', 'Calgary'),
);
print_r(cartesian($input));
One approach is to expand the previous results at each step with the items of the current group:
function cartezian1($inputArray)
{
$results = [];
foreach ($inputArray as $group) {
$results = expandItems($results, $group);
}
return $results;
}
function expandItems($sourceItems, $tails)
{
$result = [];
if (empty($sourceItems)) {
foreach ($tails as $tail) {
$result[] = [$tail];
}
return $result;
}
foreach ($sourceItems as $sourceItem) {
foreach ($tails as $tail) {
$result[] = array_merge($sourceItem, [$tail]);
}
}
return $result;
}
This solution stores all the combinations in memory and then returns them all at once, so it's fast but needs a lot of memory. It also avoids recursion.
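Since the whole result is held in memory, it can help to know how many combinations to expect before building them. A minimal sketch, assuming the same input shape as above; the helper name combination_count is hypothetical:

```php
<?php
// Hypothetical helper: number of combinations the product will contain.
function combination_count(array $inputArray)
{
    // Multiply the sizes of all groups; an empty group makes the product 0.
    return array_product(array_map('count', $inputArray));
}

$input = array(
    'arm' => array('A', 'B', 'C'),
    'gender' => array('Female', 'Male'),
    'location' => array('Vancouver', 'Calgary'),
);
echo combination_count($input), "\n"; // 3 * 2 * 2 = 12
```

The count grows multiplicatively with every extra group, which is why the memory cost of an eager solution climbs so quickly.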
Why not use a recursive generator? Memory issues: close to none (and it's beautiful).
function cartesian($a)
{
if ($a)
{
if($u=array_pop($a))
foreach(cartesian($a)as$p)
foreach($u as$v)
yield $p+[count($p)=>$v];
}
else
yield[];
}
Note: this does not preserve keys, but it's a start.
This should do (not tested):
function acartesian($a)
{
if ($a)
{
// end() takes its argument by reference, so it needs a real variable
$keys=array_keys($a);
$k=end($keys);
if($u=array_pop($a))
foreach(acartesian($a)as$p)
foreach($u as$v)
yield $p+[$k=>$v];
}
else
yield[];
}
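For reference, here is a spelled-out sketch of the same key-preserving idea together with a usage loop. The name cartesian_with_keys is hypothetical, and it assumes PHP 7.3 or later because it uses array_key_last instead of end(array_keys(...)). Unlike the compact version, it does not skip empty sub-arrays explicitly; an empty sub-array simply yields no combinations.

```php
<?php
// Hypothetical name; a spelled-out variant of the key-preserving generator.
function cartesian_with_keys(array $input)
{
    if (!$input) {
        yield array();   // the product of zero sets is one empty combination
        return;
    }
    $key = array_key_last($input);   // assumes PHP 7.3+
    $values = array_pop($input);
    foreach (cartesian_with_keys($input) as $partial) {
        foreach ($values as $value) {
            // Array union keeps the existing keys and adds the new one.
            yield $partial + array($key => $value);
        }
    }
}

$input = array(
    'arm' => array('A', 'B', 'C'),
    'gender' => array('Female', 'Male'),
);
// Combinations are produced one at a time while the loop runs.
foreach (cartesian_with_keys($input) as $combo) {
    echo implode(', ', $combo), "\n";
}
```

Because the combinations are generated lazily, the loop can stop early with break without the remaining ones ever being built.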