问题
Is there an efficient way in Python to get all partitions of a list of size n
into two subsets of size n/2
? I want to get some iterative construct such that each iteration provides two non-overlapping subsets of the original list, each subset having size n/2
.
For example:
A = [1,2,3,4,5,6] # here n = 6
# some iterative construct
# in each iteration, a pair of subsets of size n/2
# subsets = [[1,3,4], [2,5,6]] for example for one of the iterations
# subsets = [[1,2,5],[3,4,6]] a different iteration example
The subsets should be non-overlapping, e.g. [[1,2,3], [4,5,6]]
is valid but [[1,2,3], [3,4,5]]
is not. The order of the two subsets does not matter, e.g. [[1,2,3], [4,5,6]]
does not count as different from [[4,5,6], [1,2,3]]
and thus only one of those two should appear in an iteration. The order within each subset also does not matter, so [[1,2,3], [4,5,6]]
, [[1,3,2], [4,5,6]]
, [[3,2,1], [6,5,4]]
, etc. all count as the same and so only one of them should show up in whole iteration.
回答1:
Here's an itertools
-based generator that I think yields exactly the values you want.
def sub_lists(sequence):
all_but_first = set(sequence[1:])
for item in itertools.combinations(sequence[1:], len(sequence)//2 - 1):
yield [[sequence[0]] + list(item), list(all_but_first.difference(item))]
I avoid near-duplicate outputs in two ways as compared to a permutations
based approach in Suever's answer. First, I avoid yielding both [["a", "b"], ["c", "d"]]
and [["c", "d"], ["a", "b"]]
by forcing all the results to have the first value of the input sequence in the first sublist. I avoid yielding [["a", "b"], ["c", "d"]]
and [["a", "b"], ["d", "c"]]
by building the second sublist using set-subtraction.
Note that yielding nested tuples might be a little more natural than nested lists. To do that, just change the last line to:
yield (sequence[0],) + item, tuple(all_but_first.difference(item))
回答2:
You will want to use itertools.combinations to do this. The inputs are the list you want to select items out of and the second is the number of items to select.
result = [list(item) for item in itertools.combinations(input, len(input) // 2)]
For an input of [1,2,3,4]
this yields
[[1, 2], [1, 3], [1, 4], [2, 3], [2, 4], [3, 4]]
As @ShadowRanger pointed out, if order matters in your lists and you want all permutations, you'll want to substitute itertools.permutations
into the solution.
result = [list(item) for item in itertools.permutations(input, len(input) // 2)]
# [[1, 2], [1, 3], [1, 4], [2, 1], [2, 3], [2, 4], [3, 1], [3, 2], [3, 4], [4, 1], [4, 2], [4, 3]]
Edit
Upon reading your question closer it is unclear if you want all n/2
permutations like I have shown or you want a list of lits where each element is yet another list of the two "halves" of the permutation.
To accomplish this, you could do the following (incorporating some indexing help from @Blckknght)
result = [[list(item[::2]), list(item[1::2])] for item in itertools.permutations(input)]
In this case, the output of [1,2,3,4]
would be
[[[1, 3], [2, 4]], [[1, 4], [2, 3]], [[1, 2], [3, 4]], [[1, 4], [3, 2]], [[1, 2], [4, 3]], [[1, 3], [4, 2]], [[2, 3], [1, 4]], [[2, 4], [1, 3]], [[2, 1], [3, 4]], [[2, 4], [3, 1]], [[2, 1], [4, 3]], [[2, 3], [4, 1]], [[3, 2], [1, 4]], [[3, 4], [1, 2]], [[3, 1], [2, 4]], [[3, 4], [2, 1]], [[3, 1], [4, 2]], [[3, 2], [4, 1]], [[4, 2], [1, 3]], [[4, 3], [1, 2]], [[4, 1], [2, 3]], [[4, 3], [2, 1]], [[4, 1], [3, 2]], [[4, 2], [3, 1]]]
Edit2
Since order doesn't matter but you want an approach similar to the last one (lists of lists of lists), that's a little tricky with the last approach because of the array slicing. One alternative is to use set
and frozenset to construct the initial information (rather than lists), because in a set
the ordering doesn't matter when checking for equality. This will automatically allow us to remove duplicates. We can then add an extra step to convert back to a list if that's what you prefer.
from itertools import permutations
tmp = set([frozenset([frozenset(k[::2]),frozenset(k[1::2])]) for k in permutations(input)])
result = [[list(el) for el in item] for item in tmp];
This will yield
[[[1, 2], [3, 4]], [[2, 3], [1, 4]], [[1, 3], [2, 4]]]
回答3:
Here's a solution which doesn't use itertools. It uses a trick called Gosper's hack to generate bit permutations. See HAKMEM Item 175 for an explanation of how it works; this hack is also mentioned in the Wikipedia article Combinatorial number system. And it features in the accepted answer to this SO question: Iterating over all subsets of a given size.
The parts
function is a generator, so you can use it in a for
loop, as illustrated in my test.
How it works.
To partiton a list of length n into pairs of sublists of length n/2 we use a binary number bits
consisting of n/2 zero bits and n/2 one bits. A zero bit in a given position indicates that the corresponding list element goes into the left sublist, a one bit in a given position indicates that the corresponding list element goes into the right sublist.
Initially, bits
is set to 2 ** (n/2) - 1
, so if n = 6, bits
starts out as 000111
.
The generator uses Gosper's hack to permute bits
in numerical order, stopping when we get a one bit in the highest position, since that's when we start getting the reversed versions of our sublist pairs.
The code responsible for converting the pattern in bit
into the pair of sublists is:
for i, u in enumerate(lst):
ss[bits & (1<<i) == 0].append(u)
If there's a zero at bit position i
in bits
then ss[0]
gets the current item from lst
, otherwise it's appended to ss[1]
.
This code runs on Python 2 and Python 3.
from __future__ import print_function
def parts(lst):
''' Generate all pairs of equal-sized partitions of a list of even length '''
n = len(lst)
if n % 2 != 0:
raise ValueError('list length MUST be even')
lim = 1 << (n - 1)
bits = (1 << n // 2) - 1
while bits < lim:
#Use bits to partition lst
ss = [[], []]
for i, u in enumerate(lst):
ss[bits & (1<<i) == 0].append(u)
yield ss
#Calculate next bits permutation via Gosper's hack (HAKMEM #175)
u = bits & (-bits)
v = bits + u
bits = v | (((v ^ bits) // u) >> 2)
# Test
lst = list(range(1, 7))
for i, t in enumerate(parts(lst), 1):
print('{0:2d}: {1}'.format(i, t))
output
1: [[1, 2, 3], [4, 5, 6]]
2: [[1, 2, 4], [3, 5, 6]]
3: [[1, 3, 4], [2, 5, 6]]
4: [[2, 3, 4], [1, 5, 6]]
5: [[1, 2, 5], [3, 4, 6]]
6: [[1, 3, 5], [2, 4, 6]]
7: [[2, 3, 5], [1, 4, 6]]
8: [[1, 4, 5], [2, 3, 6]]
9: [[2, 4, 5], [1, 3, 6]]
10: [[3, 4, 5], [1, 2, 6]]
I admit that using something inscrutable like Gosper's hack isn't exactly Pythonic. :)
Here's how you capture the output of parts
into a list of all the sublists. It also illustrates that parts
can handle string input, although it produces the output as lists of strings.
seq = list(parts('abcd'))
print(seq)
output
[[['a', 'b'], ['c', 'd']], [['a', 'c'], ['b', 'd']], [['b', 'c'], ['a', 'd']]]
Here's another solution, using itertools to generate the combinations. It generates the pairs in a different order to the earlier version. However, it's shorter and easier to read. More importantly, it's significantly faster, between 50 to 100 percent faster in my timeit
tests, depending on the list length; the difference appears to get smaller for longer lists.
def parts(lst):
n = len(lst)
if n % 2 != 0:
raise ValueError('list length MUST be even')
first = lst[0]
for left in combinations(lst, n // 2):
if left[0] != first:
break
right = [u for u in lst if u not in left]
yield [list(left), right]
# Test
lst = list(range(1, 7))
for i, t in enumerate(parts(lst), 1):
print('{0:2d}: {1}'.format(i, t))
output
1: [[1, 2, 3], [4, 5, 6]]
2: [[1, 2, 4], [3, 5, 6]]
3: [[1, 2, 5], [3, 4, 6]]
4: [[1, 2, 6], [3, 4, 5]]
5: [[1, 3, 4], [2, 5, 6]]
6: [[1, 3, 5], [2, 4, 6]]
7: [[1, 3, 6], [2, 4, 5]]
8: [[1, 4, 5], [2, 3, 6]]
9: [[1, 4, 6], [2, 3, 5]]
回答4:
Since none of the orders matter, but we're making a list of lists of lists (where order inherently matters), we can assume some invariants: in all pairs, the first element in the first pair is 1, and both lists in a pair are in sorted order.
A = [1,2,3,4,5,6]
from itertools import combinations
first, rest = A[0], A[1:]
result = [
[
list((first,) + X),
[x for x in rest if x not in X]
]
for X in combinations(rest, len(A)/2 - 1)
]
回答5:
Here's some code that performs timeit tests on the various solutions to this problem.
To make the comparisons fair, I've commented out the argument-checking test in my functions.
I've also added a function parts_combo_set
that combines features of my combinations-based answer with that of Blckknght. It appears to be the fastest, except for very small lists.
#!/usr/bin/env python
''' Generate all pairs of equal-sized partitions of a list of even length
Testing the speed of the code at
http://stackoverflow.com/q/36025609/4014959
Written by PM 2Ring 2016.03.16
'''
from __future__ import print_function, division
from itertools import combinations
from timeit import Timer
def parts_combo(lst):
n = len(lst)
#if n % 2 != 0:
#raise ValueError('list length MUST be even')
first = lst[0]
for left in combinations(lst, n // 2):
if left[0] != first:
break
right = [u for u in lst if u not in left]
yield [list(left), right]
def parts_combo_set(lst):
n = len(lst)
#if n % 2 != 0:
#raise ValueError('list length MUST be even')
first = lst[0]
allset = set(lst)
for left in combinations(lst, n // 2):
if left[0] != first:
break
yield [list(left), list(allset.difference(left))]
def parts_gosper(lst):
n = len(lst)
#if n % 2 != 0:
#raise ValueError('list length MUST be even')
lim = 1 << (n - 1)
bits = (1 << n // 2) - 1
while bits < lim:
#Use bits to partition lst
ss = [[], []]
for i, u in enumerate(lst):
ss[bits & (1<<i) == 0].append(u)
yield ss
#Calculate next bits permutation via Gosper's hack (HAKMEM #175)
u = bits & (-bits)
v = bits + u
bits = v | (((v ^ bits) // u) >> 2)
def sub_lists(sequence):
all_but_first = set(sequence[1:])
for item in combinations(sequence[1:], len(sequence)//2 - 1):
yield [[sequence[0]] + list(item), list(all_but_first.difference(item))]
def amit(seq):
first, rest = seq[0], seq[1:]
return [
[
list((first,) + t),
[x for x in rest if x not in t]
]
for t in combinations(rest, len(seq) // 2 - 1)
]
funcs = (
parts_combo,
parts_combo_set,
parts_gosper,
sub_lists,
amit,
)
def rset(seq):
fset = frozenset
return fset([fset([fset(u),fset(v)]) for u,v in seq])
def validate():
func = funcs[0]
master = rset(func(lst))
print('\nValidating against', func.func_name)
for func in funcs[1:]:
print(func.func_name, rset(func(lst)) == master)
def time_test(loops, reps):
''' Print timing stats for all the functions '''
for func in funcs:
fname = func.func_name
print('\n%s' % fname)
setup = 'from __main__ import lst,' + fname
#cmd = 'list(%s(lst))' % fname
cmd = 'for t in %s(lst):pass' % fname
t = Timer(cmd, setup)
r = t.repeat(reps, loops)
r.sort()
print(r)
num = 6
lst = list(range(1, num + 1))
print('num =', num)
#parts = funcs[0]
#for i, t in enumerate(parts(lst), 1):
#print('{0:2d}: {1}'.format(i, t))
validate()
time_test(10000, 3)
outputs
time_test(10000, 3)
num = 6
Validating against parts_combo
parts_combo_set True
parts_gosper True
sub_lists True
amit True
parts_combo
[0.58100390434265137, 0.58798313140869141, 0.59674692153930664]
parts_combo_set
[0.74442911148071289, 0.77211689949035645, 0.79338312149047852]
parts_gosper
[1.0791628360748291, 1.089813232421875, 1.1191768646240234]
sub_lists
[0.77199792861938477, 0.79007697105407715, 0.81944608688354492]
amit
[0.60080099105834961, 0.60345196723937988, 0.60417318344116211]
time_test(1000, 3)
num = 8
Validating against parts_combo
parts_combo_set True
parts_gosper True
sub_lists True
amit True
parts_combo
[0.22465801239013672, 0.22501206398010254, 0.23627114295959473]
parts_combo_set
[0.29469203948974609, 0.29857206344604492, 0.30069589614868164]
parts_gosper
[0.43568992614746094, 0.44395184516906738, 0.44651198387145996]
sub_lists
[0.31375885009765625, 0.32754802703857422, 0.37077498435974121]
amit
[0.22520613670349121, 0.22674393653869629, 0.24075913429260254]
time_test(500, 3)
num = 10
parts_combo
[0.52618098258972168, 0.52645611763000488, 0.53866386413574219]
parts_combo_set
[0.40614008903503418, 0.41370606422424316, 0.41525506973266602]
parts_gosper
[1.0068988800048828, 1.026188850402832, 1.1649439334869385]
sub_lists
[0.48507213592529297, 0.50991582870483398, 0.51528096199035645]
amit
[0.48686790466308594, 0.52898812294006348, 0.68387198448181152]
time_test(100, 3)
num = 12
parts_combo
[0.47471189498901367, 0.47522807121276855, 0.4798729419708252]
parts_combo_set
[0.3045799732208252, 0.30534601211547852, 0.35700607299804688]
parts_gosper
[0.83456206321716309, 0.83824801445007324, 0.87273812294006348]
sub_lists
[0.36697721481323242, 0.36919784545898438, 0.38349604606628418]
amit
[0.40012097358703613, 0.40033888816833496, 0.40788102149963379]
time_test(50, 3)
num = 14
parts_combo
[0.97016000747680664, 0.97931098937988281, 1.2653739452362061]
parts_combo_set
[0.81669902801513672, 0.88839983940124512, 0.91469597816467285]
parts_gosper
[1.772817850112915, 1.9343690872192383, 1.9586498737335205]
sub_lists
[0.78162002563476562, 0.79451298713684082, 0.8126368522644043]
amit
[0.89046502113342285, 0.89572596549987793, 0.91031289100646973]
time_test(50, 3)
num = 16
parts_combo
[4.1981601715087891, 4.3565289974212646, 4.3795731067657471]
parts_combo_set
[2.5452880859375, 2.5757780075073242, 2.6059379577636719]
parts_gosper
[7.5856668949127197, 7.6066100597381592, 7.6397140026092529]
sub_lists
[3.677016019821167, 3.6800520420074463, 3.7420001029968262]
amit
[4.1738030910491943, 4.1768841743469238, 4.1960680484771729]
time_test(10, 3)
num = 18
parts_combo
[3.8362669944763184, 3.8807728290557861, 4.0259079933166504]
parts_combo_set
[1.9355819225311279, 1.9540839195251465, 1.9573280811309814]
parts_gosper
[6.3178229331970215, 6.6125278472900391, 7.0462160110473633]
sub_lists
[2.1632919311523438, 2.238231897354126, 2.2747220993041992]
amit
[3.6137850284576416, 3.6162960529327393, 3.6475629806518555]
time_test(10, 3)
num = 20
parts_combo
[16.874133110046387, 17.585763931274414, 19.725590944290161]
parts_combo_set
[7.5462148189544678, 7.5597691535949707, 7.8375740051269531]
parts_gosper
[27.312526941299438, 27.637516021728516, 28.016690015792847]
sub_lists
[7.7865769863128662, 7.8874318599700928, 8.5498230457305908]
amit
[15.554526805877686, 15.626868009567261, 16.224159002304077]
These tests were performed on a 2GHz single-core machine with 2 GB of RAM running Python 2.6.6.
来源:https://stackoverflow.com/questions/36025609/n-choose-n-2-sublists-of-a-list