How to create a zip archive of a directory in Python?

暗喜 2020-11-22 07:12

How can I create a zip archive of a directory structure in Python?

25 answers
  • 2020-11-22 08:10

    How can I create a zip archive of a directory structure in Python?

    In a Python script

    In Python 2.7+, shutil has a make_archive function.

    from shutil import make_archive
    make_archive(
      'zipfile_name', 
      'zip',           # the archive format - or tar, bztar, gztar 
      root_dir=None,   # root for archive - current working dir if None
      base_dir=None)   # start archiving from here - cwd if None too
    

    Here the zipped archive will be named zipfile_name.zip. If base_dir is a subdirectory of root_dir, only the files under base_dir are archived, but their names in the archive keep the parent directories leading up to root_dir.
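
    To make the root_dir / base_dir behaviour concrete, here is a minimal sketch. The directory names (project, pkg, other) and the helper archive_subdir_demo are made up for illustration; it builds a throwaway tree, archives only pkg, and returns the file names stored in the archive.

```python
import os
import tempfile
import zipfile
from shutil import make_archive


def archive_subdir_demo():
    """Build a throwaway tree, archive only 'pkg', return the stored file names."""
    with tempfile.TemporaryDirectory() as tmp:
        root = os.path.join(tmp, "project")  # hypothetical root_dir
        os.makedirs(os.path.join(root, "pkg", "sub"))
        os.makedirs(os.path.join(root, "other"))
        for parts in (("pkg", "a.txt"), ("pkg", "sub", "b.txt"), ("other", "c.txt")):
            with open(os.path.join(root, *parts), "w") as fh:
                fh.write("x")

        # base_dir is given relative to root_dir; only files under it are archived.
        archive = make_archive(os.path.join(tmp, "out"), "zip",
                               root_dir=root, base_dir="pkg")
        with zipfile.ZipFile(archive) as zf:
            # Skip directory entries (names ending in "/") for a compact listing.
            return sorted(n for n in zf.namelist() if not n.endswith("/"))


if __name__ == "__main__":
    print(archive_subdir_demo())
```

    The file under other/ is left out, while the archived names still start with pkg/.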

    I ran into an issue testing this on Cygwin with Python 2.7: it requires an explicit root_dir argument, even for the current working directory:

    make_archive('zipfile_name', 'zip', root_dir='.')
    

    Using Python from the shell

    You can do this with Python from the shell also using the zipfile module:

    $ python -m zipfile -c zipname sourcedir
    

    Here zipname is the name of the destination file (add .zip yourself if you want it; it is not appended automatically) and sourcedir is the path to the directory.
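
    If you would rather drive the same command from a script, a rough sketch could look like this. The helper names zip_with_cli and cli_demo are assumptions for illustration; the only thing taken from above is the python -m zipfile -c invocation, run here through sys.executable.

```python
import os
import subprocess
import sys
import tempfile
import zipfile


def zip_with_cli(source_dir, zip_path):
    """Run `python -m zipfile -c zip_path source_dir` and return the entry names."""
    subprocess.run(
        [sys.executable, "-m", "zipfile", "-c", zip_path, source_dir],
        check=True,
    )
    with zipfile.ZipFile(zip_path) as zf:
        return zf.namelist()


def cli_demo():
    # Throwaway input: a directory called 'sourcedir' holding one file.
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "sourcedir")
        os.makedirs(src)
        with open(os.path.join(src, "a.txt"), "w") as fh:
            fh.write("hello")
        return zip_with_cli(src, os.path.join(tmp, "zipname.zip"))


if __name__ == "__main__":
    print(cli_demo())
```

    The entries are stored under the directory's own name (sourcedir/a.txt), which is why the next section matters if you don't want the parent dir.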

    Zipping up a Python package (or when you just don't want the parent dir):

    If you're trying to zip up a Python package with an __init__.py and a __main__.py, and you don't want the parent dir, use

    $ python -m zipfile -c zipname sourcedir/*
    

    And

    $ python zipname
    

    would run the package. (Note that you can't run subpackages as the entry point from a zipped archive.)
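
    As a small illustration of why this works, the sketch below writes a zip whose top level contains a __main__.py and then runs the archive with the interpreter. The archive contents and the helper name build_and_run_zip are invented for the example.

```python
import os
import subprocess
import sys
import tempfile
import zipfile


def build_and_run_zip():
    """Create a zip with a top-level __main__.py and execute it with Python."""
    with tempfile.TemporaryDirectory() as tmp:
        # No .zip suffix, mirroring `python zipname` above.
        archive = os.path.join(tmp, "zipname")
        with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
            zf.writestr("__main__.py", "print('running from the archive')\n")
        result = subprocess.run(
            [sys.executable, archive],
            capture_output=True, text=True, check=True,
        )
        return result.stdout.strip()


if __name__ == "__main__":
    print(build_and_run_zip())
```

    The interpreter looks for __main__.py at the top level of the archive, so an archive that stores it under a parent directory would not start this way.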

    Zipping a Python app:

    If you have python3.5+, and specifically want to zip up a Python package, use zipapp:

    $ python -m zipapp myapp
    $ python myapp.pyz
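
    The same thing can be done from Python with zipapp.create_archive. This is only a sketch: the myapp directory and its one-line __main__.py are created on the fly, and build_and_run_pyz is a hypothetical helper name.

```python
import os
import subprocess
import sys
import tempfile
import zipapp


def build_and_run_pyz():
    """Package a tiny app directory as myapp.pyz and run it."""
    with tempfile.TemporaryDirectory() as tmp:
        app_dir = os.path.join(tmp, "myapp")
        os.makedirs(app_dir)
        # zipapp needs a __main__.py in the source dir (or a main= argument).
        with open(os.path.join(app_dir, "__main__.py"), "w") as fh:
            fh.write("print('hello from myapp')\n")

        target = os.path.join(tmp, "myapp.pyz")
        zipapp.create_archive(app_dir, target)

        result = subprocess.run(
            [sys.executable, target],
            capture_output=True, text=True, check=True,
        )
        return result.stdout.strip()


if __name__ == "__main__":
    print(build_and_run_pyz())
```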
    
  • 2020-11-22 08:10

    This function will recursively zip up a directory tree, compressing the files, and recording the correct relative filenames in the archive. The archive entries are the same as those generated by zip -r output.zip source_dir.

    import os
    import zipfile

    def make_zipfile(output_filename, source_dir):
        # Archive names are relative to the parent of source_dir,
        # so the top-level entry is the directory itself.
        relroot = os.path.abspath(os.path.join(source_dir, os.pardir))
        with zipfile.ZipFile(output_filename, "w", zipfile.ZIP_DEFLATED) as zf:
            for root, dirs, files in os.walk(source_dir):
                # add directory (needed for empty dirs)
                zf.write(root, os.path.relpath(root, relroot))
                for name in files:
                    filename = os.path.join(root, name)
                    if os.path.isfile(filename):  # regular files only
                        arcname = os.path.join(os.path.relpath(root, relroot), name)
                        zf.write(filename, arcname)
    
  • 2020-11-22 08:10

    There are many answers here already, but I hope my own version is a useful contribution. It is based on the original answer, with a more visual walkthrough; it opens a context manager for each zip file and sorts the os.walk() results so the output is ordered.

    Given these folders and their files (among other folders), I wanted to create a .zip for each cap_ folder:

    $ tree
    .
    ├── cap_01
    |    ├── 0101000001.json
    |    ├── 0101000002.json
    |    ├── 0101000003.json
    |
    ├── cap_02
    |    ├── 0201000001.json
    |    ├── 0201000002.json
    |    ├── 0201001003.json
    |
    ├── cap_03
    |    ├── 0301000001.json
    |    ├── 0301000002.json
    |    ├── 0301000003.json
    | 
    ├── docs
    |    ├── map.txt
    |    ├── main_data.xml
    |
    ├── core_files
         ├── core_master
         ├── core_slave
    

    Here's what I applied, with comments for better understanding of the process.

    $ cat zip_cap_dirs.py 
    """ Zip 'cap_*' directories. """
    import os
    import zipfile as zf


    for root, dirs, files in sorted(os.walk('.')):
        if 'cap_' in root:
            print(f"Compressing: {root}")
            # Name the .zip after the chapter directory.
            cap_dir_zip = f'{root}.zip'
            # Open a zipfile context for the current root dir.
            with zf.ZipFile(cap_dir_zip, 'w', zf.ZIP_DEFLATED) as new_zip:
                # Iterate over the files os.walk found in the current root dir.
                for f in files:
                    # Build the path to the file, relative to the starting dir.
                    f_path = os.path.join(root, f)
                    # Add the file to the open .zip.
                    new_zip.write(f_path)
    

    Basically, for each directory yielded by os.walk(path), I open a zipfile context and then iterate over files, the list of files in the current root directory. For each file I build its path relative to the starting directory and write it to the zip file that is currently open.

    And the output is presented like this:

    $ python3 zip_cap_dirs.py
    Compressing: ./cap_01
    Compressing: ./cap_02
    Compressing: ./cap_03
    

    To see the contents of each .zip archive, you can use the less command:

    $ less cap_01.zip
    
    Archive:  cap_01.zip
     Length   Method    Size  Cmpr    Date    Time   CRC-32   Name
    --------  ------  ------- ---- ---------- ----- --------  ----
      22017  Defl:N     2471  89% 2019-09-05 08:05 7a3b5ec6  cap_01/0101000001.json
      21998  Defl:N     2471  89% 2019-09-05 08:05 155bece7  cap_01/0101000002.json
      23236  Defl:N     2573  89% 2019-09-05 08:05 55fced20  cap_01/0101000003.json
    --------          ------- ---                           -------
      67251             7515  89%                            3 files
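
    If less is not set up to preview archives on your system, roughly the same information can be read with the zipfile module. This is a sketch only: describe_archive and listing_demo are hypothetical names, and the demo data is an in-memory archive with made-up contents rather than the real cap_01.zip.

```python
import io
import zipfile


def describe_archive(zip_source):
    """Return (name, original size, compressed size) for each archive entry."""
    with zipfile.ZipFile(zip_source) as zf:
        return [(info.filename, info.file_size, info.compress_size)
                for info in zf.infolist()]


def listing_demo():
    # Build a small in-memory archive so the example needs no files on disk.
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("cap_01/0101000001.json", "{}" * 1000)
    buf.seek(0)
    return describe_archive(buf)


if __name__ == "__main__":
    for name, size, compressed in listing_demo():
        print(f"{size:>8} {compressed:>8}  {name}")
```

    describe_archive also accepts a path such as 'cap_01.zip' in place of the in-memory buffer.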
    
  • 2020-11-22 08:11

    You probably want to look at the zipfile module; there's documentation at http://docs.python.org/library/zipfile.html.

    You may also want os.walk() to traverse the directory structure.
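
    A minimal sketch of how the two fit together, assuming you only care about regular files (empty directories are skipped here). The names zip_tree and walk_demo are made up for illustration.

```python
import os
import tempfile
import zipfile


def zip_tree(source_dir, zip_path):
    """Write every file under source_dir into zip_path, using relative names."""
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for root, _dirs, files in os.walk(source_dir):
            for name in files:
                full_path = os.path.join(root, name)
                # Store the path relative to source_dir, not the absolute path.
                zf.write(full_path, os.path.relpath(full_path, source_dir))
    with zipfile.ZipFile(zip_path) as zf:
        return sorted(zf.namelist())


def walk_demo():
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "data")
        os.makedirs(os.path.join(src, "nested"))
        for parts in (("top.txt",), ("nested", "inner.txt")):
            with open(os.path.join(src, *parts), "w") as fh:
                fh.write("x")
        # Keep the archive outside source_dir so it does not try to zip itself.
        return zip_tree(src, os.path.join(tmp, "data.zip"))


if __name__ == "__main__":
    print(walk_demo())
```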

  • 2020-11-22 08:11

    I prepared a function by consolidating Mark Byers' solution with Reimund and Morten Zilmer's comments (relative paths and including empty directories). As a best practice, the ZipFile is opened in a with statement.

    The function also prepares a default zip file name with the zipped directory name and '.zip' extension. Therefore, it works with only one argument: the source directory to be zipped.

    import os
    import zipfile

    def zip_dir(path_dir, path_file_zip=''):
        # Default to '<directory name>.zip' next to the source directory.
        if not path_file_zip:
            path_file_zip = os.path.join(
                os.path.dirname(path_dir), os.path.basename(path_dir) + '.zip')
        # ZipFile takes mode 'w', not 'wb' (Python 3 raises ValueError for 'wb').
        with zipfile.ZipFile(path_file_zip, 'w', zipfile.ZIP_DEFLATED) as zip_file:
            for root, dirs, files in os.walk(path_dir):
                for file_or_dir in files + dirs:
                    zip_file.write(
                        os.path.join(root, file_or_dir),
                        os.path.relpath(os.path.join(root, file_or_dir),
                                        os.path.join(path_dir, os.path.pardir)))
    
  • 2020-11-22 08:13

    Modern Python (3.6+) using the pathlib module for concise OOP-like handling of paths, and pathlib.Path.rglob() for recursive globbing. As far as I can tell, this is equivalent to George V. Reilly's answer: zips with compression, the topmost element is a directory, keeps empty dirs, uses relative paths.

    from pathlib import Path
    from zipfile import ZIP_DEFLATED, ZipFile
    
    from os import PathLike
    from typing import Union
    
    
    def zip_dir(zip_name: str, source_dir: Union[str, PathLike]):
        src_path = Path(source_dir).expanduser().resolve(strict=True)
        with ZipFile(zip_name, 'w', ZIP_DEFLATED) as zf:
            for file in src_path.rglob('*'):
                zf.write(file, file.relative_to(src_path.parent))
    

    Note: as the optional type hints indicate, zip_name can't be a Path object on early 3.6 releases (this is fixed in 3.6.2+, where ZipFile accepts path-like objects).
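
    On 3.6.2 or later, a variant that takes Path objects for both arguments might look like the sketch below. The function names zip_dir_to_path and pathlib_demo and the src/empty layout are assumptions made for the example.

```python
import tempfile
from pathlib import Path
from zipfile import ZIP_DEFLATED, ZipFile


def zip_dir_to_path(zip_path: Path, source_dir: Path) -> list:
    """Zip source_dir into zip_path and return the names that were stored."""
    src = source_dir.resolve(strict=True)
    with ZipFile(zip_path, "w", ZIP_DEFLATED) as zf:  # Path accepted on 3.6.2+
        for entry in sorted(src.rglob("*")):
            # Names are relative to the parent, so they start with the dir name.
            zf.write(entry, entry.relative_to(src.parent))
        return zf.namelist()


def pathlib_demo():
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / "src"
        (src / "empty").mkdir(parents=True)
        (src / "a.txt").write_text("x")
        return zip_dir_to_path(Path(tmp) / "out.zip", src)


if __name__ == "__main__":
    print(pathlib_demo())
```

    The empty directory is kept as an entry whose name ends in a slash.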
