Python: split a list based on a condition?

误落风尘 2020-11-22 06:56

What's the best way, both aesthetically and from a performance perspective, to split a list of items into multiple lists based on a conditional? The equivalent of:

30 answers
  • 2020-11-22 07:10

    If you want to make it in FP style:

    good, bad = [ sum(x, []) for x in zip(*(([y], []) if y in goodvals else ([], [y])
                                            for y in mylist)) ]
    

    Not the most readable solution, but it at least iterates through mylist only once.
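
For comparison, a plain loop gives the same single pass in a more readable form. This is only a minimal sketch: the helper name `partition` and the sample data are illustrative and not part of the answer above.

```python
def partition(pred, iterable):
    """Split iterable into (matching, non_matching) in one pass."""
    good, bad = [], []
    for item in iterable:
        # Choose the target list for this item, then append in place.
        (good if pred(item) else bad).append(item)
    return good, bad


goodvals = {1, 3, 7, 8, 9}          # sample data, assumed for illustration
mylist = [1, 2, 3, 4, 5, 6, 7]
good, bad = partition(lambda y: y in goodvals, mylist)
print(good)  # [1, 3, 7]
print(bad)   # [2, 4, 5, 6]
```

Unlike the `zip`-based one-liner, this version also handles an empty input, returning two empty lists.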

  • 2020-11-22 07:11

    If the list is made of groups and intermittent separators, you can use:

    def split(items, p):
        groups = [[]]
        for i in items:
            if p(i):
                groups.append([])
            groups[-1].append(i)
        return groups
    

    Usage:

    split(range(1,11), lambda x: x % 3 == 0)
    # gives [[1, 2], [3, 4, 5], [6, 7, 8], [9, 10]]
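
In the function above each separator starts the next group. If the separators should not appear in the output at all, a small variation can skip them. A sketch, with `split_dropping` as a hypothetical name and made-up sample data:

```python
def split_dropping(items, is_separator):
    """Group the items found between separators; separators are discarded."""
    groups = [[]]
    for item in items:
        if is_separator(item):
            groups.append([])      # start a new group and skip the separator
        else:
            groups[-1].append(item)
    return groups


lines = ["a", "b", "---", "c", "---", "d", "e"]
print(split_dropping(lines, lambda s: s == "---"))
# [['a', 'b'], ['c'], ['d', 'e']]
```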
    
  • 2020-11-22 07:12

    Sometimes a list comprehension is not the best tool!

    I ran a small test of the answers given in this thread against a randomly generated list. Here is how the list is generated (there is probably a better way, but that is not the point):

    good_list = ('.jpg','.jpeg','.gif','.bmp','.png')
    
    import random
    import string
    my_origin_list = []
    for i in range(10000):
        fname = ''.join(random.choice(string.ascii_lowercase) for _ in range(random.randrange(10)))
        if random.getrandbits(1):
            fext = random.choice(good_list)
        else:
            fext = "." + ''.join(random.choice(string.ascii_lowercase) for _ in range(3))
    
        my_origin_list.append((fname + fext, random.randrange(1000), fext))
    

    And here we go

    # Parand
    def f1():
        return ([e for e in my_origin_list if e[2] in good_list],
                [e for e in my_origin_list if e[2] not in good_list])
    
    # dbr
    def f2():
        a, b = list(), list()
        for e in my_origin_list:
            if e[2] in good_list:
                a.append(e)
            else:
                b.append(e)
        return a, b
    
    # John La Rooy
    def f3():
        a, b = list(), list()
        for e in my_origin_list:
            (b, a)[e[2] in good_list].append(e)
        return a, b
    
    # Ants Aasma
    from itertools import tee
    
    def f4():
        l1, l2 = tee((e[2] in good_list, e) for e in my_origin_list)
        return [i for p, i in l1 if p], [i for p, i in l2 if not p]
    
    # My own approach
    def f5():
        a, b = zip(*[(e, None) if e[2] in good_list else (None, e) for e in my_origin_list])
        return list(filter(None, a)), list(filter(None, b))
    
    # BJ Homer
    def f6():
        return (list(filter(lambda e: e[2] in good_list, my_origin_list)),
                list(filter(lambda e: e[2] not in good_list, my_origin_list)))
    

    Using the cmpthese function, the best result is dbr's answer (f2):

            Rate    f1    f6    f3    f4    f5    f2
    f1   204/s    --   -5%  -14%  -15%  -20%  -26%
    f6   215/s    6%    --   -9%  -11%  -16%  -22%
    f3   237/s   16%   10%    --   -2%   -7%  -14%
    f4   240/s   18%   12%    2%    --   -6%  -13%
    f5   255/s   25%   18%    8%    6%    --   -8%
    f2   277/s   36%   29%   17%   15%    9%    --
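
`cmpthese` is not part of the standard library. A comparable measurement can be sketched with `timeit`. The data set below is a small stand-in for the generated list, the function names `comprehension_split` and `loop_split` are illustrative, and timings depend on the machine and the Python version, so none are shown here.

```python
import random
import timeit

good_list = ('.jpg', '.jpeg', '.gif', '.bmp', '.png')
random.seed(0)  # fixed seed so repeated runs use the same data
# Stand-in data: (name, size, extension) tuples, shaped like the test list.
data = [("f%d" % n, n, random.choice(good_list + ('.avi', '.txt')))
        for n in range(10000)]


def comprehension_split():
    return ([e for e in data if e[2] in good_list],
            [e for e in data if e[2] not in good_list])


def loop_split():
    a, b = [], []
    for e in data:
        if e[2] in good_list:
            a.append(e)
        else:
            b.append(e)
    return a, b


# Both strategies must agree before their speed is compared.
assert comprehension_split() == loop_split()

for func in (comprehension_split, loop_split):
    # Best of 3 repeats of 10 calls each; lower is better.
    best = min(timeit.repeat(func, number=10, repeat=3))
    print("%-20s %.4f s per 10 calls" % (func.__name__, best))
```

Taking the minimum of several repeats reduces noise from other processes, which is the usual recommendation for `timeit.repeat`.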
    
  • 2020-11-22 07:13

    Not sure if this is a good approach, but it can also be done this way:

    from functools import reduce
    IMAGE_TYPES = ('.jpg', '.jpeg', '.gif', '.bmp', '.png')
    files = [('file1.jpg', 33, '.jpg'), ('file2.avi', 999, '.avi')]
    # acc is the (images, anims) pair accumulated so far
    images, anims = reduce(lambda acc, f: (acc[0] + [f], acc[1]) if f[2] in IMAGE_TYPES
                           else (acc[0], acc[1] + [f]), files, ([], []))
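
One thing to keep in mind is that the `+ [f]` concatenation copies the accumulated list at every step. When that matters, or when more than two categories are needed, grouping into a dictionary is one alternative. A sketch, where `group_by` and the extra sample entry are assumptions for illustration:

```python
from collections import defaultdict

IMAGE_TYPES = ('.jpg', '.jpeg', '.gif', '.bmp', '.png')


def group_by(key, iterable):
    """Collect items into lists keyed by key(item), in one pass."""
    groups = defaultdict(list)
    for item in iterable:
        groups[key(item)].append(item)   # in-place append, no copying
    return groups


files = [('file1.jpg', 33, '.jpg'), ('file2.avi', 999, '.avi'),
         ('file3.gif', 123, '.gif')]
groups = group_by(lambda f: f[2] in IMAGE_TYPES, files)
images, anims = groups[True], groups[False]
print(images)  # [('file1.jpg', 33, '.jpg'), ('file3.gif', 123, '.gif')]
print(anims)   # [('file2.avi', 999, '.avi')]
```

Passing `lambda f: f[2]` as the key instead would group the files by extension.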
    
  • 2020-11-22 07:14

    Yet another answer, short but "evil" (for list-comprehension side effects).

    digits = list(range(10))
    odd = [digits.pop(i) for i, x in enumerate(digits) if x % 2]
    
    >>> odd
    [1, 3, 5, 7, 9]
    
    >>> digits
    [0, 2, 4, 6, 8]
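
Popping from a list while iterating over it shifts the remaining items, so this approach depends on odd and even values alternating; with consecutive matches some items would be skipped. A sketch of an in-place variant without that limitation, using slice assignment (the helper name `extract` is hypothetical):

```python
def extract(items, pred):
    """Remove and return the items matching pred, editing items in place."""
    taken = [x for x in items if pred(x)]
    # Slice assignment replaces the contents of the same list object.
    items[:] = [x for x in items if not pred(x)]
    return taken


digits = [1, 3, 5, 0, 2, 7, 4]      # consecutive odd values on purpose
odd = extract(digits, lambda x: x % 2)
print(odd)     # [1, 3, 5, 7]
print(digits)  # [0, 2, 4]
```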
    
  • 2020-11-22 07:15

    First go (pre-OP-edit): Use sets:

    mylist = [1,2,3,4,5,6,7]
    goodvals = [1,3,7,8,9]
    
    myset = set(mylist)
    goodset = set(goodvals)
    
    print(sorted(myset.intersection(goodset)))  # [1, 3, 7]
    print(sorted(myset.difference(goodset)))    # [2, 4, 5, 6]
    

    That's good for both readability (IMHO) and performance.
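
One caveat: set operations drop duplicates and do not preserve the original order. If either matters, a set can still be used for the membership test alone. A minimal sketch with made-up sample data:

```python
mylist = [7, 2, 7, 1, 5, 3, 2]      # note the duplicates and the ordering
goodset = {1, 3, 7, 8, 9}

# Set lookup is O(1) on average; the lists keep order and duplicates.
good = [x for x in mylist if x in goodset]
bad = [x for x in mylist if x not in goodset]
print(good)  # [7, 7, 1, 3]
print(bad)   # [2, 5, 2]

# The pure set version collapses the duplicates.
print(sorted(set(mylist) & goodset))  # [1, 3, 7]
```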

    Second go (post-OP-edit):

    Create your list of good extensions as a set:

    IMAGE_TYPES = {'.jpg', '.jpeg', '.gif', '.bmp', '.png'}
    

    and that will increase performance. Otherwise, what you have looks fine to me.
