Temporary variable within list comprehension

[愿得一人] 2021-02-19 05:07

I often end up with a piece of code that looks like this:

raw_data  = [(s.split(',')[0], s.split(',')[1]) for s in all_lines if s.split(',')[1] != '"NaN"']


        
3 answers
  • 2021-02-19 05:18

    If you have two processing steps, you can nest another list comprehension:

    raw_data  = [(lhs, rhs) 
                for lhs, rhs 
                in [s.split(',')[:2] for s in all_lines]
                if rhs != '"NaN"']
    

    You can use a generator expression inside instead (it also gives a small performance gain):

                in (s.split(',')[:2] for s in all_lines)
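
For reference, here is the generator variant written out in full, as a minimal sketch. The sample `all_lines` values are made up for illustration; only the `'"NaN"'` marker follows the question's format.

```python
# Hypothetical sample input; the '"NaN"' marker mirrors the question's data.
all_lines = ['a,0.5', 'b,"NaN"', 'c,0.25,extra']

# The inner generator splits each line exactly once and keeps the first
# two fields; the outer comprehension unpacks and filters them.
raw_data = [(lhs, rhs)
            for lhs, rhs in (s.split(',')[:2] for s in all_lines)
            if rhs != '"NaN"']

print(raw_data)  # [('a', '0.5'), ('c', '0.25')]
```

Note that a line with fewer than two comma-separated fields would raise a ValueError during unpacking, so this sketch assumes every line has at least two fields.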
    

    The nested version is even faster than your original implementation:

    import timeit
    
    setup = '''import random, string;
    all_lines = [','.join((random.choice(string.ascii_letters),
                        str(random.random() if random.random() > 0.3 else '"NaN"')))
                        for i in range(10000)]'''
    oneloop = '''[(s.split(',')[0], s.split(',')[1]) 
                  for s in all_lines if s.split(',')[1] != '"NaN"']'''
    twoloops = '''raw_data  = [(lhs, rhs) 
                    for lhs, rhs 
                    in [s.split(',') for s in all_lines]
                    if rhs != '"NaN"']'''
    
    timeit.timeit(oneloop, setup, number=1000)  # 7.77 secs
    timeit.timeit(twoloops, setup, number=1000) # 4.68 secs
    
  • 2021-02-19 05:22

    You can't.

    A list comprehension consists of brackets containing an expression followed by a for clause, then zero or more for or if clauses. The result will be a new list resulting from evaluating the expression in the context of the for and if clauses which follow it.

    (From the Python tutorial's section on list comprehensions.)

    Assignment with = in Python is a statement, not an expression.

    As Padraic Cunningham comments, if you need to split the string multiple times, don't do it in a list comprehension.
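
One way to follow that advice is an ordinary loop that splits each line once. This is only a sketch: `parse_lines` is a hypothetical helper name, and it assumes every line has at least two comma-separated fields.

```python
def parse_lines(all_lines):
    """Hypothetical helper: split each line once and skip '"NaN"' values."""
    raw_data = []
    for s in all_lines:
        parts = s.split(',')  # a plain assignment is fine in a loop body
        if parts[1] != '"NaN"':
            raw_data.append((parts[0], parts[1]))
    return raw_data


print(parse_lines(['a,0.5', 'b,"NaN"']))  # [('a', '0.5')]
```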

  • 2021-02-19 05:29

    Starting with Python 3.8 and its introduction of assignment expressions (PEP 572, the := operator), you can bind a local variable inside a list comprehension to avoid evaluating the same expression twice.

    In our case, we can bind the result of line.split(',') to the name parts, use it to filter out entries where parts[1] equals NaN, and then reuse parts to build the mapped value:

    # lines = ['1,2,3,4', '5,NaN,7,8']
    [(parts[0], parts[1]) for line in lines if (parts := line.split(','))[1] != 'NaN']
    # [('1', '2')]
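
On interpreters older than 3.8, a similar effect can be had by iterating over a one-element list, which binds the name inside the comprehension. The following is a sketch of that idiom using the same sample data, not something taken from the answer above.

```python
lines = ['1,2,3,4', '5,NaN,7,8']

# "for parts in [line.split(',')]" runs exactly once per line, so it acts
# as a temporary binding; no assignment expression is needed.
pairs = [(parts[0], parts[1])
         for line in lines
         for parts in [line.split(',')]
         if parts[1] != 'NaN']

print(pairs)  # [('1', '2')]
```

One difference worth knowing: with this idiom `parts` stays local to the comprehension, whereas a name bound with := inside a comprehension is visible in the enclosing scope afterwards.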
    