I want to read the contents of a zip file into memory rather than extracting them to disc, find a particular file in the archive, open the file and extract a line from it.
IMO, just using read() is enough:
import fnmatch
from zipfile import ZipFile

zfile = ZipFile('name.zip', 'r')
files = []
for name in zfile.namelist():
    if fnmatch.fnmatch(name, '*_readme.xml'):
        # read() returns the member's contents as bytes
        files.append(zfile.read(name))
This builds a list with the contents of every file that matches the pattern.
You can then parse the contents afterwards by iterating through the list:
for file in files:
    print(file[:35].decode())  # "parsing"
Or better, use a function that parses each file as soon as it is read:
import fnmatch
import sys
import zipfile

zip_name = sys.argv[1]
zfile = zipfile.ZipFile(zip_name, 'r')

def parse(contents, member_name=""):
    if member_name:
        print("Parsed `{}`:".format(member_name))
    print(contents[:35].decode())  # "parsing"

for name in zfile.namelist():
    if fnmatch.fnmatch(name, '*.cpp'):
        parse(zfile.read(name), name)
This way no data is kept in memory longer than needed, so the memory footprint is smaller. This can matter if the files are big.
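If a single member is itself large and you only need one line from it, you may not need read() at all. Below is a minimal sketch, assuming the member is text in a known encoding; the helper name first_matching_line and its parameters needle and encoding are made up for illustration. It uses ZipFile.open, which returns a binary file-like object, and wraps it in io.TextIOWrapper so the member is decoded line by line:

```python
import io
import zipfile


def first_matching_line(zip_path, member_name, needle, encoding="utf-8"):
    """Return the first line of member_name that contains needle, or None.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    with zipfile.ZipFile(zip_path, "r") as zfile:
        # open() gives a binary stream over the member instead of one bytes object.
        with zfile.open(member_name) as raw:
            # TextIOWrapper decodes the stream so we can iterate over text lines.
            for line in io.TextIOWrapper(raw, encoding=encoding):
                if needle in line:
                    return line.rstrip("\n")
    return None
```

For example, first_matching_line('name.zip', 'docs/a_readme.xml', '<version>') would return the first line containing '<version>' in that member (the member path here is only a placeholder). Iteration stops at the first match, so the rest of the member is never decoded.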