itertools.permutations generates where its elements are treated as unique based on their position, not on their value. So basically I want to avoid duplicates like this:
This is my solution with 10 lines:
class Solution(object):
def permute_unique(self, nums):
perms = [[]]
for n in nums:
new_perm = []
for perm in perms:
for i in range(len(perm) + 1):
new_perm.append(perm[:i] + [n] + perm[i:])
# handle duplication
if i < len(perm) and perm[i] == n: break
perms = new_perm
return perms
if __name__ == '__main__':
s = Solution()
print s.permute_unique([1, 1, 1])
print s.permute_unique([1, 2, 1])
print s.permute_unique([1, 2, 3])
--- Result ----
[[1, 1, 1]]
[[1, 2, 1], [2, 1, 1], [1, 1, 2]]
[[3, 2, 1], [2, 3, 1], [2, 1, 3], [3, 1, 2], [1, 3, 2], [1, 2, 3]]
A naive approach might be to take the set of the permutations:
list(set(it.permutations([1, 1, 1])))
# [(1, 1, 1)]
However, this technique wastefully computes replicate permutations and discards them. A more efficient approach would be more_itertools.distinct_permutations, a third-party tool.
Code
import itertools as it
import more_itertools as mit
list(mit.distinct_permutations([1, 1, 1]))
# [(1, 1, 1)]
Performance
Using a larger iterable, we will compare the performances between the naive and third-party techniques.
iterable = [1, 1, 1, 1, 1, 1]
len(list(it.permutations(iterable)))
# 720
%timeit -n 10000 list(set(it.permutations(iterable)))
# 10000 loops, best of 3: 111 µs per loop
%timeit -n 10000 list(mit.distinct_permutations(iterable))
# 10000 loops, best of 3: 16.7 µs per loop
We see more_itertools.distinct_permutations
is an order of magnitude faster.
Details
From the source, a recursion algorithm (as seen in the accepted answer) is used to compute distinct permutations, thereby obviating wasteful computations. See the source code for more details.
Adapted to remove recursion, use a dictionary and numba for high performance but not using yield/generator style so memory usage is not limited:
import numba
@numba.njit
def perm_unique_fast(elements): #memory usage too high for large permutations
eset = set(elements)
dictunique = dict()
for i in eset: dictunique[i] = elements.count(i)
result_list = numba.typed.List()
u = len(elements)
for _ in range(u): result_list.append(0)
s = numba.typed.List()
results = numba.typed.List()
d = u
while True:
if d > 0:
for i in dictunique:
if dictunique[i] > 0: s.append((i, d - 1))
i, d = s.pop()
if d == -1:
dictunique[i] += 1
if len(s) == 0: break
continue
result_list[d] = i
if d == 0: results.append(result_list[:])
dictunique[i] -= 1
s.append((i, -1))
return results
import timeit
l = [2, 2, 3, 3, 4, 4, 5, 5, 6, 6]
%timeit list(perm_unique(l))
#377 ms ± 26 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
ltyp = numba.typed.List()
for x in l: ltyp.append(x)
%timeit perm_unique_fast(ltyp)
#293 ms ± 3.37 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
assert list(sorted(perm_unique(l))) == list(sorted([tuple(x) for x in perm_unique_fast(ltyp)]))
About 30% faster but still suffers a bit due to list copying and management.
Alternatively without numba but still without recursion and using a generator to avoid memory issues:
def perm_unique_fast_gen(elements):
eset = set(elements)
dictunique = dict()
for i in eset: dictunique[i] = elements.count(i)
result_list = list() #numba.typed.List()
u = len(elements)
for _ in range(u): result_list.append(0)
s = list()
d = u
while True:
if d > 0:
for i in dictunique:
if dictunique[i] > 0: s.append((i, d - 1))
i, d = s.pop()
if d == -1:
dictunique[i] += 1
if len(s) == 0: break
continue
result_list[d] = i
if d == 0: yield result_list
dictunique[i] -= 1
s.append((i, -1))
This is my attempt without resorting to set / dict, as a generator using recursion, but using string as input. Output is also ordered in natural order:
def perm_helper(head: str, tail: str):
if len(tail) == 0:
yield head
else:
last_c = None
for index, c in enumerate(tail):
if last_c != c:
last_c = c
yield from perm_helper(
head + c, tail[:index] + tail[index + 1:]
)
def perm_generator(word):
yield from perm_helper("", sorted(word))
example:
from itertools import takewhile
word = "POOL"
list(takewhile(lambda w: w != word, (x for x in perm_generator(word))))
# output
# ['LOOP', 'LOPO', 'LPOO', 'OLOP', 'OLPO', 'OOLP', 'OOPL', 'OPLO', 'OPOL', 'PLOO', 'POLO']
The best solution to this problem I have seen uses Knuth's "Algorithm L" (as noted previously by Gerrat in the comments to the original post):
http://stackoverflow.com/questions/12836385/how-can-i-interleave-or-create-unique-permutations-of-two-stings-without-recurs/12837695
Some timings:
Sorting [1]*12+[0]*12
(2,704,156 unique permutations):
Algorithm L → 2.43 s
Luke Rahne's solution → 8.56 s
scipy.multiset_permutations()
→ 16.8 s
You could try using set:
>>> list(itertools.permutations(set([1,1,2,2])))
[(1, 2), (2, 1)]
The call to set removed duplicates