Extract zip to memory, parse contents

孤独总比滥情好 2021-01-22 21:30

I want to read the contents of a zip file into memory rather than extracting them to disk, find a particular file in the archive, open it, and extract a line from it.

4 answers
  •  清歌不尽
    2021-01-22 22:15

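    IMO, just using ZipFile.read is enough:

    import fnmatch
    from zipfile import ZipFile

    # Collect the raw bytes of every member whose name matches the pattern.
    files = []
    with ZipFile('name.zip', 'r') as zfile:
      for name in zfile.namelist():
        if fnmatch.fnmatch(name, '*_readme.xml'):
          files.append(zfile.read(name))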
    

    This builds a list holding the contents (as bytes) of every file whose name matches the pattern.

    You can then parse the contents afterwards by iterating over the list:

    for data in files:
      print(data[:35].decode()) # "parsing": show the first 35 bytes as text
    

    Or better, use a function that parses each member as soon as it is read:

    import fnmatch
    import sys
    import zipfile
    
    zip_name = sys.argv[1]  # path to the archive, given on the command line
    
    def parse(contents, member_name=""):
      # Placeholder "parsing": print the first 35 bytes, decoded as text.
      if member_name:
        print("Parsed `{}`:".format(member_name))
      print(contents[:35].decode())
    
    with zipfile.ZipFile(zip_name, 'r') as zfile:
      for name in zfile.namelist():
        if fnmatch.fnmatch(name, '*.cpp'):
          parse(zfile.read(name), name)
    

    This way only one file's contents are held in memory at a time instead of all of them, so the memory footprint is smaller. That can matter if the files are big.
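
    If only a single line is needed, as in the question, the member does not have to be read in full. The following is a minimal sketch, not a definitive implementation: the function name find_line and its parameters are made up for illustration, and it assumes the member is text in a known encoding (UTF-8 by default). It uses ZipFile.open, which returns a file-like object that decompresses as it is read, so the loop can stop at the first matching line.

```python
import fnmatch
import io
import zipfile


def find_line(zip_source, member_pattern, needle, encoding="utf-8"):
    """Search members whose names match `member_pattern` and return
    (member_name, line) for the first line containing `needle`.
    Returns None if no such line is found.

    `zip_source` may be a path or a binary file-like object such as
    io.BytesIO, so the archive itself can also live in memory.
    """
    with zipfile.ZipFile(zip_source, "r") as zfile:
        for name in zfile.namelist():
            if not fnmatch.fnmatch(name, member_pattern):
                continue
            # open() yields a binary stream that is decompressed
            # incrementally; nothing is written to disk.
            with zfile.open(name) as raw:
                # Wrap the byte stream so it can be iterated as text lines.
                for line in io.TextIOWrapper(raw, encoding=encoding):
                    if needle in line:
                        return name, line.rstrip("\n")
    return None
```

    For example, find_line('name.zip', '*_readme.xml', '<version>') would return the member name and the first line containing <version>, if there is one. Passing an io.BytesIO that holds the archive bytes instead of a path works the same way.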
