I need to figure out how to write output to a compressed file in Python, similar to the Perl two-liner below:
open ZIPPED, "| gzip -c > zipped.gz";
print ZI
Make sure you use the same compression level when comparing speeds. By default, the Linux gzip command uses level 6, while Python's gzip module uses level 9. I tested this in Python 3.6.8 with gzip version 1.5, compressing 600 MB of data from a MySQL dump. With default settings:
Python gzip module takes 9.24 seconds and makes a 47.1 MB file
subprocess gzip takes 8.61 seconds and makes a 48.5 MB file
After changing the Python module to level 6 so they match:
Python gzip module takes 8.09 seconds and makes a 48.6 MB file
subprocess gzip takes 8.55 seconds and makes a 48.5 MB file
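import gzip
import subprocess
import time

# dump (bytes), outfile and outfile2 are assumed to be defined earlier

# subprocess method: pipe the data through the external gzip command
start = time.time()
with open(outfile, 'wb') as f:
    subprocess.run(['gzip'], input=dump, stdout=f, check=True)
print('subprocess finished after {:.2f} seconds'.format(time.time() - start))

# gzip method: compress in-process at level 6 to match the gzip command's default
start = time.time()
with gzip.open(outfile2, 'wb', compresslevel=6) as z:
    z.write(dump)
print('gzip module finished after {:.2f} seconds'.format(time.time() - start))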
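For the original question (printing lines to a compressed file, as the Perl pipe does), a minimal sketch using only the standard gzip module might look like the following. The helper names `write_gzipped_lines` and `read_gzipped_text` are made up for illustration, and level 6 is chosen here only to match the gzip command's default.

```python
import gzip


def write_gzipped_lines(path, lines, level=6):
    """Write an iterable of text lines to a gzip-compressed file.

    Hypothetical helper for illustration; level=6 mirrors the gzip CLI default.
    """
    # 'wt' opens the compressed stream in text mode, so str can be written directly.
    with gzip.open(path, "wt", compresslevel=level, encoding="utf-8") as zipped:
        for line in lines:
            zipped.write(line + "\n")


def read_gzipped_text(path):
    """Read the whole compressed file back as text (hypothetical helper)."""
    with gzip.open(path, "rt", encoding="utf-8") as zipped:
        return zipped.read()
```

Calling `write_gzipped_lines("zipped.gz", ["first line", "second line"])` should produce a file that `gunzip` or `zcat` can read. Data is compressed as it is written, so the whole output does not need to be held in memory first.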