Just a style question: Is there a built-in method to get the combinations, assuming the pairing is commutative and excluding elements paired with themselves?
you mean this?
a = ["1", "2", "3"]
b = ["1", "2", "3"]
print([(x, y) for x in a for y in b])
output:
[('1', '1'), ('1', '2'), ('1', '3'), ('2', '1'), ('2', '2'), ('2', '3'), ('3', '1'), ('3', '2'), ('3', '3')]
Use itertools.combinations if both lists are the same, as they are here. In the general case, use itertools.product followed by some filtering:
In [7]: from itertools import product
...: a = ["1", "2", "3"]
...: b = ["a", "b", "c"]
In [8]: list(filter(lambda t: t[0] < t[1], product(a,b)))
Out[8]:
[('1', 'a'),
('1', 'b'),
('1', 'c'),
('2', 'a'),
('2', 'b'),
('2', 'c'),
('3', 'a'),
('3', 'b'),
('3', 'c')]
Also, I think the term combination already means that the order of elements in the result doesn't matter.
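To illustrate the point about terminology, here is a small sketch comparing itertools.combinations with itertools.permutations on the list from the question. The variable names are only for illustration.

```python
from itertools import combinations, permutations

items = ["1", "2", "3"]

# combinations: each unordered pair appears once, and no element is paired with itself
unordered = list(combinations(items, 2))

# permutations: both (x, y) and (y, x) appear
ordered = list(permutations(items, 2))

print(unordered)     # [('1', '2'), ('1', '3'), ('2', '3')]
print(len(ordered))  # 6
```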
Ok, Theodros is right. To make up for it, here's a version which should work on any list of lists:
l = [['1','2','3'], ['a','b'], ['x','y']]
set(tuple(sorted(p)) for p in product(*l) if len(set(p)) > 1)
gives (appropriately sorted)
set([('1', 'a', 'x'),
('3', 'a', 'y'),
('2', 'b', 'y'),
('2', 'a', 'y'),
('1', 'a', 'y'),
('1', 'b', 'y'),
('2', 'a', 'x'),
('3', 'b', 'y'),
('1', 'b', 'x'),
('2', 'b', 'x'),
('3', 'a', 'x'),
('3', 'b', 'x')])
And it also works on the previous counterexample l = [[1,2,3], [1,3,4,5]]:
set([(1, 2), (1, 3), (1, 4), (1, 5), (2, 3), (2, 5), (3, 4), (2, 4), (3, 5)])
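One caveat: len(set(p)) > 1 only rejects tuples whose elements are all equal, so with three or more lists a tuple such as (1, 1, 2) would still pass. If every element should be distinct, a stricter check could look like the sketch below. The function name unordered_products is hypothetical, and the inputs are assumed to be hashable and mutually comparable.

```python
from itertools import product

def unordered_products(lists):
    """Return unordered tuples taking one element from each list, with no repeated element."""
    result = set()
    for p in product(*lists):
        # keep p only if all of its elements are pairwise distinct
        if len(set(p)) == len(p):
            result.add(tuple(sorted(p)))
    return result

pairs = unordered_products([[1, 2, 3], [1, 3, 4, 5]])
print(sorted(pairs))
# [(1, 2), (1, 3), (1, 4), (1, 5), (2, 3), (2, 4), (2, 5), (3, 4), (3, 5)]
```

For two lists this gives the same result as the one-liner above; the difference only shows up with three or more lists.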
This will work if you do not care that the ordering in the resulting tuples does not map to the input lists (you do not care whether you get (1,2) or (2,1)). Here you'll get the combination with the smaller element first:
a = [1,2,3]
b = [1,3,4,5]
set([(min(x,y), max(x,y)) for x in a for y in b if x != y])
gives
set([(1, 2),
(1, 3),
(1, 4),
(1, 5),
(2, 3),
(2, 5),
(3, 4),
(2, 4),
(3, 5)])
With strings
a = '1 2 3'.split()
b = '1 3 4 5'.split()
you get
set([('2', '3'),
('3', '5'),
('1', '4'),
('3', '4'),
('1', '5'),
('1', '2'),
('2', '5'),
('1', '3'),
('2', '4')])
The apparent difference in the ordering comes from the different hashes for the strings and the numbers.
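Since the iteration order of a set is an implementation detail, the displayed order should not be relied on. If a reproducible order matters, one option is to sort the set before printing, as in this sketch, which reuses the string inputs above:

```python
a = '1 2 3'.split()
b = '1 3 4 5'.split()

# smaller element first, so (x, y) and (y, x) collapse to one tuple
pairs = set((min(x, y), max(x, y)) for x in a for y in b if x != y)

# sorted() returns a list ordered by the tuples themselves
ordered_pairs = sorted(pairs)
print(ordered_pairs)
```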
I think the simplest and most explicit (i.e. self-explanatory) solution is probably this:
>>> for i in range(len(a)):
...     for j in range(i+1, len(b)):
...         print(a[i], b[j])
...
1 2
1 3
2 3
Unless you're doing something very fast in the inner loop, the overhead of Python for loops will hardly be a bottleneck.
Or, this:
>>> [(a[i], b[j]) for i in range(len(a)) for j in range(i+1, len(b))]
[('1', '2'), ('1', '3'), ('2', '3')]
Assuming a and b are identical.
>>> import itertools
>>> a = ["1", "2", "3"]
>>> list(itertools.combinations(a,2))
[('1', '2'), ('1', '3'), ('2', '3')]
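As a quick sanity check, the index-based comprehension and itertools.combinations should agree when both lists are the same. A minimal sketch, assuming b is simply a copy of a:

```python
from itertools import combinations

a = ["1", "2", "3"]
b = list(a)  # assumed identical to a

# pair each element of a with the elements of b at later positions
by_index = [(a[i], b[j]) for i in range(len(a)) for j in range(i + 1, len(b))]
by_itertools = list(combinations(a, 2))

print(by_index == by_itertools)  # True
```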
I don't think this is the most self-explanatory way to do it so I wouldn't recommend it but I'll include it for completeness.
It uses the fact that the commutative pairs are the upper and lower triangles of the matrix produced by the product of both arrays.
The numpy function np.tril_indices returns a tuple containing the indices for only the lower-triangle of an array.
>>> import numpy as np
>>> n_vars = len(a)
>>> assert len(b) == n_vars # Only works if len(a) == len(b)
>>> [(a[i], b[j]) for i, j in zip(*np.tril_indices(n_vars))]
[('1', '1'), ('2', '1'), ('2', '2'), ('3', '1'), ('3', '2'), ('3', '3')]
>>> [(a[i], b[j]) for i, j in zip(*np.tril_indices(n_vars, k=-1))]
[('2', '1'), ('3', '1'), ('3', '2')]
The k argument in np.tril_indices is an offset parameter, so k=-1 means it doesn't include the diagonal terms.
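Since np.tril_indices is a NumPy function, generating the indices is probably very fast, although the list comprehension itself still runs as a Python loop.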
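For readers without NumPy at hand, the index pattern for a square n-by-n array can be reproduced in plain Python. This is only a sketch of which (row, column) pairs lie on or below the k-th diagonal; the helper name lower_triangle_indices is hypothetical and is not a NumPy function.

```python
def lower_triangle_indices(n, k=0):
    """Return (row, col) pairs on or below the k-th diagonal of an n x n grid, row by row."""
    return [(i, j) for i in range(n) for j in range(n) if j - i <= k]

a = ["1", "2", "3"]
b = ["1", "2", "3"]

with_diagonal = [(a[i], b[j]) for i, j in lower_triangle_indices(len(a))]
without_diagonal = [(a[i], b[j]) for i, j in lower_triangle_indices(len(a), k=-1)]

print(with_diagonal)
print(without_diagonal)  # [('2', '1'), ('3', '1'), ('3', '2')]
```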
If you convert a and b to numpy arrays you can also do this:
>>> a = np.array(a)
>>> b = np.array(b)
>>> ind = np.tril_indices(n_vars, k=-1)
>>> list(zip(a[ind[0]], b[ind[1]]))
[('2', '1'), ('3', '1'), ('3', '2')]
>>> np.stack([a[ind[0]], b[ind[1]]]).T
array([['2', '1'],
['3', '1'],
['3', '2']], dtype='<U1')