Current scenario: I have 900 files in a directory called directoryA. The files are named file0.txt through file899.txt, each 15MB in size. I loop through each file sequentially, run a series of operations on its contents, and write a corresponding CSV to directoryB.
To fully utilize your hardware cores, use the multiprocessing library:
from multiprocessing import Pool
from os import listdir
import os.path
import re
import csv

# Paths are defined at module level so the worker processes can see them.
mypath = "some/path/"
inputDir = os.path.join(mypath, 'dirA')
outputDir = os.path.join(mypath, 'dirB')

def process_file(filename):
    # Load the text file as a list of rows using the csv module
    with open(os.path.join(inputDir, filename)) as f:
        rows = list(csv.reader(f))
    # ... run your operations on the rows here ...
    # Regex the int out of the filename: file1.txt gives 1, file42.txt gives 42
    index = re.search(r'\d+', filename).group()
    # Write a corresponding csv file to dirB: input file99.txt becomes out99.csv
    with open(os.path.join(outputDir, 'out{}.csv'.format(index)), 'w') as f:
        csv.writer(f).writerows(rows)

if __name__ == '__main__':
    p = Pool(12)
    p.map(process_file, listdir(inputDir))
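
If you don't want to hard-code 12 workers, a reasonable default (my assumption, not part of the question) is to size the pool to the number of CPU cores, which multiprocessing exposes as cpu_count(). A minimal sketch, reusing process_file and inputDir from the snippet above:

from multiprocessing import Pool, cpu_count
from os import listdir

if __name__ == '__main__':
    # Pool() with no argument already defaults to cpu_count() workers;
    # passing it explicitly just makes the choice visible.
    p = Pool(cpu_count())
    p.map(process_file, listdir(inputDir))

Since each task also reads and writes about 15MB from disk, the work is partly I/O-bound, so the ideal worker count may differ from the core count; it's worth timing a few pool sizes.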
Documentation for multiprocessing: https://docs.python.org/2/library/multiprocessing.html