I quite often end up with a piece of code that looks like this:
raw_data = [(s.split(',')[0], s.split(',')[1]) for s in all_lines if s.split(',')[1] != '"NaN"']
If the processing takes two steps, you can nest another list comprehension:
raw_data = [(lhs, rhs)
            for lhs, rhs
            in [s.split(',')[:2] for s in all_lines]
            if rhs != '"NaN"']
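As a quick check, here is a minimal runnable sketch of the nested comprehension. The contents of `all_lines` are made up for illustration (the real data comes from your own input), and the third line carries an extra column to show why the `[:2]` slice is there.

```python
# Hypothetical sample input; the real all_lines comes from your own data.
all_lines = ['a,0.5', 'b,"NaN"', 'c,0.25,extra']

# The inner comprehension splits each line once; [:2] keeps the first two
# fields so the unpacking still works when a line has extra columns.
raw_data = [(lhs, rhs)
            for lhs, rhs
            in [s.split(',')[:2] for s in all_lines]
            if rhs != '"NaN"']

print(raw_data)  # [('a', '0.5'), ('c', '0.25')]
```

Note that a line with fewer than two fields would still raise a `ValueError` during unpacking, so this assumes every line contains at least one comma.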
You can use a generator expression inside instead (it also gives a small performance gain):
in (s.split(',')[:2] for s in all_lines)
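Written out in full, the generator variant might look like the sketch below. Again, the sample `all_lines` is hypothetical; only the brackets around the inner expression change.

```python
# Hypothetical sample input; the real all_lines comes from your own data.
all_lines = ['a,0.5', 'b,"NaN"', 'c,0.25']

# Parentheses instead of brackets: the split results are produced one at a
# time and consumed by the outer comprehension, so no intermediate list of
# all split lines is built.
raw_data = [(lhs, rhs)
            for lhs, rhs
            in (s.split(',')[:2] for s in all_lines)
            if rhs != '"NaN"']

print(raw_data)  # [('a', '0.5'), ('c', '0.25')]
```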
The nested version is even faster than your implementation, since each line is split only once instead of up to three times:
import timeit

# string.ascii_letters works on Python 2 and 3 (string.letters is Python 2 only).
setup = '''import random, string;
all_lines = [','.join((random.choice(string.ascii_letters),
                       str(random.random() if random.random() > 0.3 else '"NaN"')))
             for i in range(10000)]'''
# Original version: splits each line up to three times.
oneloop = '''[(s.split(',')[0], s.split(',')[1])
              for s in all_lines if s.split(',')[1] != '"NaN"']'''
# Nested version: splits each line once.
twoloops = '''raw_data = [(lhs, rhs)
                          for lhs, rhs
                          in [s.split(',') for s in all_lines]
                          if rhs != '"NaN"']'''

timeit.timeit(oneloop, setup, number=1000)   # 7.77 secs
timeit.timeit(twoloops, setup, number=1000)  # 4.68 secs
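The benchmark above does not time the generator variant. If you want to measure it on your own machine, a sketch along these lines should work. The function names are hypothetical, the test data merely mimics the shape used above, and `string.ascii_letters` is used on the assumption that you run Python 3. No timings are quoted here because they depend on your interpreter and hardware.

```python
import random
import string
import timeit

# Hypothetical test data in the same shape as the benchmark above.
all_lines = [','.join((random.choice(string.ascii_letters),
                       str(random.random()) if random.random() > 0.3 else '"NaN"'))
             for _ in range(10000)]

def one_loop():
    # Splits each line up to three times.
    return [(s.split(',')[0], s.split(',')[1])
            for s in all_lines if s.split(',')[1] != '"NaN"']

def nested_list():
    # Splits each line once, via an intermediate list.
    return [tuple(pair)
            for pair in [s.split(',')[:2] for s in all_lines]
            if pair[1] != '"NaN"']

def nested_generator():
    # Splits each line once, via a generator expression.
    return [(lhs, rhs)
            for lhs, rhs in (s.split(',')[:2] for s in all_lines)
            if rhs != '"NaN"']

# The variants must agree before their timings are worth comparing.
assert one_loop() == nested_list() == nested_generator()

for func in (one_loop, nested_list, nested_generator):
    # Take the best of three repeats to reduce noise from other processes.
    best = min(timeit.repeat(func, number=20, repeat=3))
    print('%-17s %.3f s' % (func.__name__, best))
```

Passing the functions directly to `timeit.repeat` avoids the string-based `setup` and lets the equality check run on exactly the same data that is being timed.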