N choose N/2 sublists of a list

前端 未结 5 796
抹茶落季
抹茶落季 2021-01-14 20:27

Is there an efficient way in Python to get all partitions of a list of size n into two subsets of size n/2? I want to get some iterative construct

相关标签:
5条回答
  • 2021-01-14 20:38

    Here's an itertools-based generator that I think yields exactly the values you want.

    def sub_lists(sequence):
        all_but_first = set(sequence[1:])
        for item in itertools.combinations(sequence[1:], len(sequence)//2 - 1):
            yield [[sequence[0]] + list(item), list(all_but_first.difference(item))]
    

    I avoid near-duplicate outputs in two ways as compared to a permutations based approach in Suever's answer. First, I avoid yielding both [["a", "b"], ["c", "d"]] and [["c", "d"], ["a", "b"]] by forcing all the results to have the first value of the input sequence in the first sublist. I avoid yielding [["a", "b"], ["c", "d"]] and [["a", "b"], ["d", "c"]] by building the second sublist using set-subtraction.

    Note that yielding nested tuples might be a little more natural than nested lists. To do that, just change the last line to:

    yield (sequence[0],) + item, tuple(all_but_first.difference(item))
    
    0 讨论(0)
  • 2021-01-14 20:44

    You will want to use itertools.combinations to do this. The inputs are the list you want to select items out of and the second is the number of items to select.

    result = [list(item) for item in itertools.combinations(input, len(input) // 2)]
    

    For an input of [1,2,3,4] this yields

    [[1, 2], [1, 3], [1, 4], [2, 3], [2, 4], [3, 4]]
    

    As @ShadowRanger pointed out, if order matters in your lists and you want all permutations, you'll want to substitute itertools.permutations into the solution.

    result = [list(item) for item in itertools.permutations(input, len(input) // 2)]
    # [[1, 2], [1, 3], [1, 4], [2, 1], [2, 3], [2, 4], [3, 1], [3, 2], [3, 4], [4, 1], [4, 2], [4, 3]]
    

    Edit

    Upon reading your question closer it is unclear if you want all n/2 permutations like I have shown or you want a list of lits where each element is yet another list of the two "halves" of the permutation.

    To accomplish this, you could do the following (incorporating some indexing help from @Blckknght)

    result = [[list(item[::2]), list(item[1::2])] for item in itertools.permutations(input)]
    

    In this case, the output of [1,2,3,4] would be

    [[[1, 3], [2, 4]], [[1, 4], [2, 3]], [[1, 2], [3, 4]], [[1, 4], [3, 2]], [[1, 2], [4, 3]], [[1, 3], [4, 2]], [[2, 3], [1, 4]], [[2, 4], [1, 3]], [[2, 1], [3, 4]], [[2, 4], [3, 1]], [[2, 1], [4, 3]], [[2, 3], [4, 1]], [[3, 2], [1, 4]], [[3, 4], [1, 2]], [[3, 1], [2, 4]], [[3, 4], [2, 1]], [[3, 1], [4, 2]], [[3, 2], [4, 1]], [[4, 2], [1, 3]], [[4, 3], [1, 2]], [[4, 1], [2, 3]], [[4, 3], [2, 1]], [[4, 1], [3, 2]], [[4, 2], [3, 1]]]
    

    Edit2

    Since order doesn't matter but you want an approach similar to the last one (lists of lists of lists), that's a little tricky with the last approach because of the array slicing. One alternative is to use set and frozenset to construct the initial information (rather than lists), because in a set the ordering doesn't matter when checking for equality. This will automatically allow us to remove duplicates. We can then add an extra step to convert back to a list if that's what you prefer.

    from itertools import permutations
    tmp = set([frozenset([frozenset(k[::2]),frozenset(k[1::2])]) for k in permutations(input)]) 
    result = [[list(el) for el in item] for item in tmp];
    

    This will yield

    [[[1, 2], [3, 4]], [[2, 3], [1, 4]], [[1, 3], [2, 4]]]
    
    0 讨论(0)
  • 2021-01-14 20:45

    Since none of the orders matter, but we're making a list of lists of lists (where order inherently matters), we can assume some invariants: in all pairs, the first element in the first pair is 1, and both lists in a pair are in sorted order.

    A = [1,2,3,4,5,6]
    
    from itertools import combinations
    first, rest = A[0], A[1:]
    result = [
                [
                    list((first,) + X), 
                    [x for x in rest if x not in X]
                ] 
                for X in combinations(rest, len(A)/2 - 1)
             ]
    
    0 讨论(0)
  • 2021-01-14 20:46

    Here's some code that performs timeit tests on the various solutions to this problem.

    To make the comparisons fair, I've commented out the argument-checking test in my functions.

    I've also added a function parts_combo_set that combines features of my combinations-based answer with that of Blckknght. It appears to be the fastest, except for very small lists.

    #!/usr/bin/env python
    
    ''' Generate all pairs of equal-sized partitions of a list of even length
    
        Testing the speed of the code at
        http://stackoverflow.com/q/36025609/4014959
    
        Written by PM 2Ring 2016.03.16
    '''
    
    from __future__ import print_function, division
    from itertools import combinations
    from timeit import Timer
    
    def parts_combo(lst):
        n = len(lst)
        #if n % 2 != 0:
            #raise ValueError('list length MUST be even')
    
        first = lst[0]
        for left in combinations(lst, n // 2):
            if left[0] != first:
                break
            right = [u for u in lst if u not in left]
            yield [list(left), right]
    
    def parts_combo_set(lst):
        n = len(lst)
        #if n % 2 != 0:
            #raise ValueError('list length MUST be even')
    
        first = lst[0]
        allset = set(lst)
        for left in combinations(lst, n // 2):
            if left[0] != first:
                break
            yield [list(left), list(allset.difference(left))]
    
    def parts_gosper(lst):
        n = len(lst)
        #if n % 2 != 0:
            #raise ValueError('list length MUST be even')
    
        lim = 1 << (n - 1)
        bits = (1 << n // 2) - 1
    
        while bits < lim:
            #Use bits to partition lst
            ss = [[], []]
            for i, u in enumerate(lst):
                ss[bits & (1<<i) == 0].append(u)
            yield ss
    
            #Calculate next bits permutation via Gosper's hack (HAKMEM #175)
            u = bits & (-bits)
            v = bits + u
            bits = v | (((v ^ bits) // u) >> 2)
    
    def sub_lists(sequence):
        all_but_first = set(sequence[1:])
        for item in combinations(sequence[1:], len(sequence)//2 - 1):
            yield [[sequence[0]] + list(item), list(all_but_first.difference(item))]
    
    def amit(seq):
        first, rest = seq[0], seq[1:]
        return [
                    [
                        list((first,) + t), 
                        [x for x in rest if x not in t]
                    ]
                    for t in combinations(rest, len(seq) // 2 - 1)
               ]
    
    funcs = (
        parts_combo,
        parts_combo_set,
        parts_gosper,
        sub_lists,
        amit,
    )
    
    def rset(seq):
        fset = frozenset
        return fset([fset([fset(u),fset(v)]) for u,v in seq])
    
    def validate():
        func = funcs[0]
        master = rset(func(lst))
        print('\nValidating against', func.func_name)
        for func in funcs[1:]:
            print(func.func_name, rset(func(lst)) == master)
    
    def time_test(loops, reps):
        ''' Print timing stats for all the functions '''
        for func in funcs:
            fname = func.func_name
            print('\n%s' % fname)
            setup = 'from __main__ import lst,' + fname
            #cmd = 'list(%s(lst))' % fname
            cmd = 'for t in %s(lst):pass' % fname
            t = Timer(cmd, setup)
            r = t.repeat(reps, loops)
            r.sort()
            print(r)
    
    
    num = 6
    lst = list(range(1, num + 1))
    print('num =', num)
    
    #parts = funcs[0]
    #for i, t in enumerate(parts(lst), 1):
        #print('{0:2d}: {1}'.format(i, t))
    
    validate()
    time_test(10000, 3)
    

    outputs

    time_test(10000, 3)

    num = 6
    
    Validating against parts_combo
    parts_combo_set True
    parts_gosper True
    sub_lists True
    amit True
    
    parts_combo
    [0.58100390434265137, 0.58798313140869141, 0.59674692153930664]
    
    parts_combo_set
    [0.74442911148071289, 0.77211689949035645, 0.79338312149047852]
    
    parts_gosper
    [1.0791628360748291, 1.089813232421875, 1.1191768646240234]
    
    sub_lists
    [0.77199792861938477, 0.79007697105407715, 0.81944608688354492]
    
    amit
    [0.60080099105834961, 0.60345196723937988, 0.60417318344116211]
    

    time_test(1000, 3)

    num = 8
    
    Validating against parts_combo
    parts_combo_set True
    parts_gosper True
    sub_lists True
    amit True
    
    parts_combo
    [0.22465801239013672, 0.22501206398010254, 0.23627114295959473]
    
    parts_combo_set
    [0.29469203948974609, 0.29857206344604492, 0.30069589614868164]
    
    parts_gosper
    [0.43568992614746094, 0.44395184516906738, 0.44651198387145996]
    
    sub_lists
    [0.31375885009765625, 0.32754802703857422, 0.37077498435974121]
    
    amit
    [0.22520613670349121, 0.22674393653869629, 0.24075913429260254]
    

    time_test(500, 3)

    num = 10
    
    parts_combo
    [0.52618098258972168, 0.52645611763000488, 0.53866386413574219]
    
    parts_combo_set
    [0.40614008903503418, 0.41370606422424316, 0.41525506973266602]
    
    parts_gosper
    [1.0068988800048828, 1.026188850402832, 1.1649439334869385]
    
    sub_lists
    [0.48507213592529297, 0.50991582870483398, 0.51528096199035645]
    
    amit
    [0.48686790466308594, 0.52898812294006348, 0.68387198448181152]
    

    time_test(100, 3)

    num = 12
    
    parts_combo
    [0.47471189498901367, 0.47522807121276855, 0.4798729419708252]
    
    parts_combo_set
    [0.3045799732208252, 0.30534601211547852, 0.35700607299804688]
    
    parts_gosper
    [0.83456206321716309, 0.83824801445007324, 0.87273812294006348]
    
    sub_lists
    [0.36697721481323242, 0.36919784545898438, 0.38349604606628418]
    
    amit
    [0.40012097358703613, 0.40033888816833496, 0.40788102149963379]
    

    time_test(50, 3)

    num = 14
    
    parts_combo
    [0.97016000747680664, 0.97931098937988281, 1.2653739452362061]
    
    parts_combo_set
    [0.81669902801513672, 0.88839983940124512, 0.91469597816467285]
    
    parts_gosper
    [1.772817850112915, 1.9343690872192383, 1.9586498737335205]
    
    sub_lists
    [0.78162002563476562, 0.79451298713684082, 0.8126368522644043]
    
    amit
    [0.89046502113342285, 0.89572596549987793, 0.91031289100646973]
    

    time_test(50, 3)

    num = 16
    
    parts_combo
    [4.1981601715087891, 4.3565289974212646, 4.3795731067657471]
    
    parts_combo_set
    [2.5452880859375, 2.5757780075073242, 2.6059379577636719]
    
    parts_gosper
    [7.5856668949127197, 7.6066100597381592, 7.6397140026092529]
    
    sub_lists
    [3.677016019821167, 3.6800520420074463, 3.7420001029968262]
    
    amit
    [4.1738030910491943, 4.1768841743469238, 4.1960680484771729]
    

    time_test(10, 3)

    num = 18
    
    parts_combo
    [3.8362669944763184, 3.8807728290557861, 4.0259079933166504]
    
    parts_combo_set
    [1.9355819225311279, 1.9540839195251465, 1.9573280811309814]
    
    parts_gosper
    [6.3178229331970215, 6.6125278472900391, 7.0462160110473633]
    
    sub_lists
    [2.1632919311523438, 2.238231897354126, 2.2747220993041992]
    
    amit
    [3.6137850284576416, 3.6162960529327393, 3.6475629806518555]
    

    time_test(10, 3)

    num = 20
    
    parts_combo
    [16.874133110046387, 17.585763931274414, 19.725590944290161]
    
    parts_combo_set
    [7.5462148189544678, 7.5597691535949707, 7.8375740051269531]
    
    parts_gosper
    [27.312526941299438, 27.637516021728516, 28.016690015792847]
    
    sub_lists
    [7.7865769863128662, 7.8874318599700928, 8.5498230457305908]
    
    amit
    [15.554526805877686, 15.626868009567261, 16.224159002304077]
    

    These tests were performed on a 2GHz single-core machine with 2 GB of RAM running Python 2.6.6.

    0 讨论(0)
  • 2021-01-14 20:53

    Here's a solution which doesn't use itertools. It uses a trick called Gosper's hack to generate bit permutations. See HAKMEM Item 175 for an explanation of how it works; this hack is also mentioned in the Wikipedia article Combinatorial number system. And it features in the accepted answer to this SO question: Iterating over all subsets of a given size.

    The parts function is a generator, so you can use it in a for loop, as illustrated in my test.

    How it works.

    To partiton a list of length n into pairs of sublists of length n/2 we use a binary number bits consisting of n/2 zero bits and n/2 one bits. A zero bit in a given position indicates that the corresponding list element goes into the left sublist, a one bit in a given position indicates that the corresponding list element goes into the right sublist.

    Initially, bits is set to 2 ** (n/2) - 1, so if n = 6, bits starts out as 000111.

    The generator uses Gosper's hack to permute bits in numerical order, stopping when we get a one bit in the highest position, since that's when we start getting the reversed versions of our sublist pairs.

    The code responsible for converting the pattern in bit into the pair of sublists is:

        for i, u in enumerate(lst):
            ss[bits & (1<<i) == 0].append(u)
    

    If there's a zero at bit position i in bits then ss[0] gets the current item from lst, otherwise it's appended to ss[1].

    This code runs on Python 2 and Python 3.

    from __future__ import print_function
    
    def parts(lst):
        ''' Generate all pairs of equal-sized partitions of a list of even length '''
    
        n = len(lst)
        if n % 2 != 0:
            raise ValueError('list length MUST be even')
    
        lim = 1 << (n - 1)
        bits = (1 << n // 2) - 1
    
        while bits < lim:
            #Use bits to partition lst
            ss = [[], []]
            for i, u in enumerate(lst):
                ss[bits & (1<<i) == 0].append(u)
            yield ss
    
            #Calculate next bits permutation via Gosper's hack (HAKMEM #175)
            u = bits & (-bits)
            v = bits + u
            bits = v | (((v ^ bits) // u) >> 2)
    
    # Test
    lst = list(range(1, 7))
    for i, t in enumerate(parts(lst), 1):
        print('{0:2d}: {1}'.format(i, t))    
    

    output

     1: [[1, 2, 3], [4, 5, 6]]
     2: [[1, 2, 4], [3, 5, 6]]
     3: [[1, 3, 4], [2, 5, 6]]
     4: [[2, 3, 4], [1, 5, 6]]
     5: [[1, 2, 5], [3, 4, 6]]
     6: [[1, 3, 5], [2, 4, 6]]
     7: [[2, 3, 5], [1, 4, 6]]
     8: [[1, 4, 5], [2, 3, 6]]
     9: [[2, 4, 5], [1, 3, 6]]
    10: [[3, 4, 5], [1, 2, 6]]
    

    I admit that using something inscrutable like Gosper's hack isn't exactly Pythonic. :)


    Here's how you capture the output of parts into a list of all the sublists. It also illustrates that parts can handle string input, although it produces the output as lists of strings.

    seq = list(parts('abcd'))
    print(seq)
    

    output

    [[['a', 'b'], ['c', 'd']], [['a', 'c'], ['b', 'd']], [['b', 'c'], ['a', 'd']]] 
    


    Here's another solution, using itertools to generate the combinations. It generates the pairs in a different order to the earlier version. However, it's shorter and easier to read. More importantly, it's significantly faster, between 50 to 100 percent faster in my timeit tests, depending on the list length; the difference appears to get smaller for longer lists.

    def parts(lst):
        n = len(lst)
        if n % 2 != 0:
            raise ValueError('list length MUST be even')
    
        first = lst[0]
        for left in combinations(lst, n // 2):
            if left[0] != first:
                break
            right = [u for u in lst if u not in left]
            yield [list(left), right]
    
    # Test
    lst = list(range(1, 7))
    for i, t in enumerate(parts(lst), 1):
        print('{0:2d}: {1}'.format(i, t))    
    

    output

     1: [[1, 2, 3], [4, 5, 6]]
     2: [[1, 2, 4], [3, 5, 6]]
     3: [[1, 2, 5], [3, 4, 6]]
     4: [[1, 2, 6], [3, 4, 5]]
     5: [[1, 3, 4], [2, 5, 6]]
     6: [[1, 3, 5], [2, 4, 6]]
     7: [[1, 3, 6], [2, 4, 5]]
     8: [[1, 4, 5], [2, 3, 6]]
     9: [[1, 4, 6], [2, 3, 5]]
    
    0 讨论(0)
提交回复
热议问题