Suppose I have some Python code like the following:
input = open(\"input.txt\")
x = (process_line(line) for line in input)
y = (process_item(item) for item in x)
You can't really parallelize reading from or writing to files; those will ultimately be your bottleneck. Are you sure your bottleneck here is CPU, and not I/O?
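A quick way to check is to time a pass that only reads the file against passes that also run each processing stage. This is just a sketch: time_pass is a made-up helper, and process_line / process_item are the functions from your question.

import time

def time_pass(label, work):
    # Hypothetical helper: one full pass over the file, printing wall-clock time.
    start = time.perf_counter()
    for line in open("input.txt"):
        work(line)
    print(label, time.perf_counter() - start)

time_pass("read only     ", lambda line: None)
time_pass("+ process_line", lambda line: process_line(line))
time_pass("+ process_item", lambda line: process_item(process_line(line)))

If the last number dwarfs the first, you're CPU-bound and parallelizing will pay off; the gaps between the passes also tell you which stage is the slow one.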
Since your processing contains no dependencies (according to you), it's trivially simple to use Python's multiprocessing.Pool class.
There are a couple of ways to write this, but the one that's easiest to debug is to find the independent critical path (the slowest part of the code) and make just that part run in parallel. Let's presume it's process_item.
…And that's it, actually. Code:
import multiprocessing

p = multiprocessing.Pool()  # defaults to one worker per available CPU
input = open("input.txt")
x = (process_line(line) for line in input)  # lazy: one line at a time
y = p.imap(process_item, x)  # fan process_item out across the worker processes
z = (generate_output_line(item) + "\n" for item in y)
output = open("output.txt", "w")
output.writelines(z)
I haven't tested it, but this is the basic idea. Pool's imap method makes sure results are returned in the right order.
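Two optional tweaks, assuming the same process_item as above (note it has to be a module-level function so the workers can pickle it). If you have many small items, a chunksize cuts the per-item inter-process overhead, and if output order doesn't matter, imap_unordered yields results as soon as any worker finishes; chunksize=100 is just an illustrative value:

# Larger chunks reduce IPC cost while keeping the output order:
y = p.imap(process_item, x, chunksize=100)

# If the order of output lines doesn't matter, take results as they complete:
y = p.imap_unordered(process_item, x, chunksize=100)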