I often end up with a piece of code that looks like this:
raw_data = [(s.split(',')[0], s.split(',')[1]) for s in all_lines if s.split(',
If you need two processing steps, you can nest another list comprehension:
raw_data = [(lhs, rhs)
for lhs, rhs
in [s.split(',')[:2] for s in all_lines]
if rhs != '"NaN"']
You can use a generator expression inside instead (it also gives a small performance gain):
in (s.split(',')[:2] for s in all_lines)
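Written out in full, the generator version might look like the sketch below. The sample all_lines data is made up for illustration; the quoted "NaN" marker follows the timing setup further down.

```python
# Hypothetical sample input; the real all_lines would come from your own data.
all_lines = ['a,0.5', 'b,"NaN"', 'c,0.25']

# The inner generator splits each line once; the outer comprehension
# unpacks the first two fields and filters on the second one.
raw_data = [(lhs, rhs)
            for lhs, rhs
            in (s.split(',')[:2] for s in all_lines)
            if rhs != '"NaN"']

print(raw_data)  # [('a', '0.5'), ('c', '0.25')]
```

Because the inner part is a generator, no intermediate list of split lines is built.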
It will even be faster than your implementation:
import timeit
setup = '''import random, string;
all_lines = [','.join((random.choice(string.ascii_letters),
str(random.random() if random.random() > 0.3 else '"NaN"')))
for i in range(10000)]'''
oneloop = '''[(s.split(',')[0], s.split(',')[1])
for s in all_lines if s.split(',')[1] != '"NaN"']'''
twoloops = '''raw_data = [(lhs, rhs)
for lhs, rhs
in [s.split(',') for s in all_lines]
if rhs != '"NaN"']'''
timeit.timeit(oneloop, setup, number=1000) # 7.77 secs
timeit.timeit(twoloops, setup, number=1000) # 4.68 secs
You can't.
A list comprehension consists of brackets containing an expression followed by a for clause, then zero or more for or if clauses. The result will be a new list resulting from evaluating the expression in the context of the for and if clauses which follow it.
(Quoted from the Python tutorial's section on list comprehensions.)
Assignment in Python is not an expression.
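A small way to check that statement, as a sketch: compiling a comprehension that uses a plain = inside its condition is rejected by the parser. The names source and accepted are only for illustration.

```python
# A comprehension that tries to bind a name with a plain "=" in its condition.
source = "[(p[0], p[1]) for s in lines if (p = s.split(','))[1] != 'NaN']"

try:
    # compile() only parses the text; it never runs it, so "lines" need not exist.
    compile(source, '<example>', 'eval')
    accepted = True
except SyntaxError:
    accepted = False

print(accepted)  # False
```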
As Padraic Cunningham comments, if you need to split it multiple times, don't do it in a list comprehension.
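One way to follow that advice is an ordinary loop, where the split result can be stored in a normal variable. This is only a minimal sketch; the sample lines are made up, and it compares against a bare NaN, so adjust the marker to match your data.

```python
lines = ['1,2,3,4', '5,NaN,7,8']  # hypothetical sample input

raw_data = []
for line in lines:
    parts = line.split(',')  # split once per line and keep the result
    if parts[1] != 'NaN':
        raw_data.append((parts[0], parts[1]))

print(raw_data)  # [('1', '2')]
```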
Starting with Python 3.8 and the introduction of assignment expressions (PEP 572, the := operator), it's possible to use a local variable within a list comprehension in order to avoid evaluating the same expression twice.
In our case, we can name the result of line.split(',') as a variable parts while using it to filter the list (keeping a line only if parts[1] is not equal to NaN), and then re-use parts to produce the mapped value:
# lines = ['1,2,3,4', '5,NaN,7,8']
[(parts[0], parts[1]) for line in lines if (parts := line.split(','))[1] != 'NaN']
# [('1', '2')]
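On interpreters older than 3.8, a commonly used workaround is to bind the name with an extra for clause over a one-element list. This is not part of the answer above and is shown only as an alternative sketch; it reuses the same sample lines.

```python
lines = ['1,2,3,4', '5,NaN,7,8']

result = [(parts[0], parts[1])
          for line in lines
          for parts in [line.split(',')]  # one-element list: binds parts once per line
          if parts[1] != 'NaN']

print(result)  # [('1', '2')]
```

The inner for clause runs exactly once per line, so line.split(',') is still evaluated only once.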