What is the Python way to walk a directory tree?

走了就别回头了 2020-12-06 09:52

I feel that assigning files and folders and doing the += [item] part is a bit hackish. Any suggestions? I'm using Python 3.2.

from os import *
from os.pat         


        
11 Answers
  • 2020-12-06 10:06

    For anyone looking for a solution using pathlib (Python >= 3.4):

    from pathlib import Path
    
    def walk(path): 
        for p in Path(path).iterdir(): 
            if p.is_dir(): 
                yield from walk(p)
                continue
            yield p.resolve()
    
    # recursively traverse all files from current directory
    for p in walk(Path('.')): 
        print(p)
    
    # the function returns a generator so if you need a list you need to build one
    all_files = list(walk(Path('.'))) 
    

    However, as mentioned above, this does not preserve the top-down ordering given by os.walk.
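
    For comparison, here is a minimal sketch of the os.walk approach that does keep the top-down order. The name walk_files is hypothetical, and the sorting is an added assumption to make the order deterministic; os.walk itself does not sort.

```python
import os

def walk_files(top):
    """Yield the path of every file under top, visiting directories top-down."""
    # os.walk yields (dirpath, dirnames, filenames); topdown=True is the default,
    # so a directory's own files are reported before those of its subdirectories.
    for dirpath, dirnames, filenames in os.walk(top):
        dirnames.sort()  # in-place sort controls the order of descent
        for name in sorted(filenames):
            yield os.path.join(dirpath, name)
```

    As with the pathlib version, this is a generator, so wrap it in list() if a list is needed.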

  • 2020-12-06 10:08

    os.walk and os.scandir are great options; however, I've been using pathlib more and more, and with pathlib you can use the .glob() method:

    from pathlib import Path
    
    root_directory = Path(".")
    for path_object in root_directory.glob('**/*'):
        if path_object.is_file():
            print(f"hi, I'm a file: {path_object}")
        elif path_object.is_dir():
            print(f"hi, I'm a dir: {path_object}")
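
    Since os.scandir is mentioned above as an alternative, here is a rough sketch of a recursive walk built on it. The name scan_tree is hypothetical, and the choice not to follow symlinked directories is an assumption; using os.scandir as a context manager requires Python 3.6 or later.

```python
import os

def scan_tree(path):
    """Recursively yield an os.DirEntry for every non-directory entry below path."""
    with os.scandir(path) as entries:
        for entry in entries:
            # follow_symlinks=False: do not descend into symlinked directories
            if entry.is_dir(follow_symlinks=False):
                yield from scan_tree(entry.path)
            else:
                yield entry
```

    Each yielded entry exposes .name and .path, and entry.stat() can often reuse information already gathered during the scan.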
    
    
    
  • 2020-12-06 10:11
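    import os
    
    def dir_contents(path):
        files, folders = [], []
        for name in os.listdir(path):
            # join with path: a bare name would be checked against the current working directory
            if os.path.isfile(os.path.join(path, name)):
                files.append(name)
            else:
                folders.append(name)
        return files, folders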
    
  • 2020-12-06 10:16

    Since Python 3.4 there is a new module, pathlib. To get all directories and files directly inside path, one can do:

    from pathlib import Path
    
    dirs = [str(item) for item in Path(path).iterdir() if item.is_dir()]
    files = [str(item) for item in Path(path).iterdir() if item.is_file()]
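
    The two comprehensions above list the directory twice. A possible single-pass variant is sketched below; the name split_entries is hypothetical. Entries that are neither a directory nor a regular file (for example broken symlinks) are skipped, which matches the two comprehensions.

```python
from pathlib import Path

def split_entries(path):
    """Return (dirs, files) as lists of strings for the entries directly inside path."""
    dirs, files = [], []
    for item in Path(path).iterdir():
        if item.is_dir():
            dirs.append(str(item))
        elif item.is_file():
            files.append(str(item))
    return dirs, files
```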
    
  • 2020-12-06 10:18

    Instead of the built-in os.walk and os.path.walk (the latter exists only in Python 2), I use something derived from this piece of code, which I found suggested elsewhere. I had originally linked to it, but have replaced the link with the inlined source:

    import os
    import stat
    
    class DirectoryStatWalker:
        # a forward iterator that traverses a directory tree, and
        # returns the filename and additional file information
    
        def __init__(self, directory):
            self.stack = [directory]
            self.files = []
            self.index = 0
    
        def __getitem__(self, index):
            while 1:
                try:
                    file = self.files[self.index]
                    self.index = self.index + 1
                except IndexError:
                    # pop next directory from stack
                    self.directory = self.stack.pop()
                    self.files = os.listdir(self.directory)
                    self.index = 0
                else:
                    # got a filename
                    fullname = os.path.join(self.directory, file)
                    # lstat does not follow symlinks; with os.stat the S_ISLNK test below could never be true
                    st = os.lstat(fullname)
                    mode = st[stat.ST_MODE]
                    if stat.S_ISDIR(mode) and not stat.S_ISLNK(mode):
                        self.stack.append(fullname)
                    return fullname, st
    
    if __name__ == '__main__':
        for file, st in DirectoryStatWalker("/usr/include"):
            print(file, st[stat.ST_SIZE])
    

    It walks the directories recursively and is quite efficient and easy to read.
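
    The class above relies on the legacy __getitem__ iteration protocol, where the IndexError raised by popping the empty stack ends the loop. A rough generator-based equivalent is sketched below; the name stat_walk is hypothetical, and using os.lstat so that symlinks are not followed is an assumption about the intent of the S_ISLNK check.

```python
import os
import stat

def stat_walk(top):
    """Yield (fullname, lstat_result) for every entry below top."""
    stack = [top]
    while stack:
        directory = stack.pop()
        for name in os.listdir(directory):
            fullname = os.path.join(directory, name)
            st = os.lstat(fullname)  # lstat: report symlinks themselves, do not follow them
            if stat.S_ISDIR(st.st_mode):
                stack.append(fullname)  # real directory: visit it later
            yield fullname, st
```

    Usage mirrors the original: for name, st in stat_walk("/usr/include"): print(name, st.st_size).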
