How to use glob() to find files recursively?

天涯浪人 2020-11-21 22:54

This is what I have:

glob(os.path.join('src', '*.c'))

but I want to search the subfolders of src. Something like this would work:

28 Answers
  • 2020-11-21 23:23

    Similar to other solutions, but using fnmatch.fnmatch instead of glob, since os.walk already listed the filenames:

    import os, fnmatch
    
    
    def find_files(directory, pattern):
        for root, dirs, files in os.walk(directory):
            for basename in files:
                if fnmatch.fnmatch(basename, pattern):
                    filename = os.path.join(root, basename)
                    yield filename
    
    
    for filename in find_files('src', '*.c'):
        print('Found C source:', filename)
    

    Also, using a generator allows you to process each file as it is found, instead of finding all the files first and then processing them.
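To illustrate the point about generators, here is a minimal sketch that stops at the first match instead of walking the whole tree. It repeats the find_files generator from above so that it runs on its own; the helper name first_match and the temporary directory layout are assumptions made up for the demonstration.

```python
import fnmatch
import os
import tempfile


def find_files(directory, pattern):
    """Yield paths under directory whose basename matches pattern."""
    for root, dirs, files in os.walk(directory):
        for basename in files:
            if fnmatch.fnmatch(basename, pattern):
                yield os.path.join(root, basename)


def first_match(directory, pattern):
    """Return the first matching path, or None (hypothetical helper)."""
    # next() pulls a single item; the walk is not advanced any further.
    return next(find_files(directory, pattern), None)


# Build a throwaway tree so the example runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "src", "lib"))
    for rel in ("src/main.c", "src/lib/util.c", "src/lib/notes.txt"):
        open(os.path.join(tmp, *rel.split("/")), "w").close()

    found = sorted(os.path.relpath(p, tmp) for p in find_files(tmp, "*.c"))
    first = first_match(tmp, "*.c")
    missing = first_match(tmp, "*.h")
    print(found)
    print(first)
```

Which file comes back first depends on the order in which os.walk visits the tree, so callers that need a particular one should not rely on it.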

  • 2020-11-21 23:23

    Or with a list comprehension:

     >>> import os
     >>> root = r"c:\User\xtofl"
     >>> binfiles = [os.path.join(base, f)
     ...             for base, _, files in os.walk(root)
     ...             for f in files if f.endswith(".jpg")]
    
  • 2020-11-21 23:24

    I just wrote this. It prints files and directories as a hierarchy, without using fnmatch or os.walk:

    #!/usr/bin/env python3
    
    import glob
    import os
    import sys
    
    
    def dirlist(path, depth=1):
        """Print the entries below path, one '----' prefix per nesting level."""
        for entry in glob.glob(os.path.join(path, "*")):
            if os.path.isfile(entry):
                print('----' * depth + os.path.basename(entry))
            elif os.path.isdir(entry):
                print('----' * depth + os.path.basename(entry))
                # Recurse one level deeper for subdirectories.
                dirlist(entry, depth + 1)
    
    
    path = os.path.normpath(sys.argv[1])
    print(os.path.basename(path))
    dirlist(path)
    
  • 2020-11-21 23:24

    In addition to the suggested answers, you can do this lazily with a generator expression and itertools.chain:

    import os, glob, itertools
    
    results = itertools.chain.from_iterable(glob.iglob(os.path.join(root,'*.c'))
                                                   for root, dirs, files in os.walk('src'))
    
    for f in results: print(f)
    

    Besides fitting on one line and avoiding unnecessary lists in memory, this has the nice side effect that you can use it much like the ** pattern. For example, os.path.join(root, 'some/path/*.c') matches all .c files in every subdirectory of src that contains that structure.
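A minimal sketch of that variation, wrapped in a function so the sub-pattern can be changed. The name iglob_under, the 'gen' directory, and the sample files are hypothetical and exist only for the demonstration.

```python
import glob
import itertools
import os
import tempfile


def iglob_under(top, subpattern):
    """Lazily glob subpattern relative to every directory at or below top."""
    return itertools.chain.from_iterable(
        glob.iglob(os.path.join(root, subpattern))
        for root, dirs, files in os.walk(top)
    )


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical layout: two modules, each with a 'gen' subdirectory.
    for rel in ("src/a/gen/x.c", "src/b/gen/y.c", "src/b/z.c"):
        full = os.path.join(tmp, *rel.split("/"))
        os.makedirs(os.path.dirname(full), exist_ok=True)
        open(full, "w").close()

    src = os.path.join(tmp, "src")
    # Consume the lazy iterator while the temporary tree still exists.
    matches = sorted(
        os.path.relpath(p, src).replace(os.sep, "/")
        for p in iglob_under(src, os.path.join("gen", "*.c"))
    )
    print(matches)  # ['a/gen/x.c', 'b/gen/y.c']
```

Only files that sit directly inside a gen directory are returned, so src/b/z.c is skipped.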

  • 2020-11-21 23:25

    Consider pathlib.Path.rglob() (the pathlib module is available from Python 3.4).

    This is like calling Path.glob() with "**/" added in front of the given relative pattern:

    import pathlib
    
    
    for p in pathlib.Path("src").rglob("*.c"):
        print(p)
    

    See also @taleinat's related post here and a similar post elsewhere.
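The equivalence between rglob() and a "**/" prefix can be checked with a short sketch. It also shows that the glob module itself accepts ** when recursive=True is passed (Python 3.5 and later), which is the closest match to what the question asks for. The file names below are hypothetical and are created in a temporary directory.

```python
import glob
import os
import pathlib
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    src = pathlib.Path(tmp) / "src"
    # Hypothetical files, created only so the example has something to find.
    for rel in ("main.c", "lib/util.c", "lib/deep/io.c", "lib/readme.txt"):
        target = src / rel
        target.parent.mkdir(parents=True, exist_ok=True)
        target.touch()

    via_rglob = sorted(p.relative_to(src).as_posix() for p in src.rglob("*.c"))
    via_glob = sorted(p.relative_to(src).as_posix() for p in src.glob("**/*.c"))
    # The glob module understands '**' too, but only with recursive=True.
    via_module = sorted(
        pathlib.Path(p).relative_to(src).as_posix()
        for p in glob.glob(os.path.join(str(src), "**", "*.c"), recursive=True)
    )
    print(via_rglob)  # ['lib/deep/io.c', 'lib/util.c', 'main.c']
```

All three calls return the same files here, including main.c at the top level, because ** also matches zero directories.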

  • 2020-11-21 23:26

    Here is my solution, which uses list comprehensions to search a directory and all of its subdirectories recursively for multiple file extensions:

    import glob
    import os
    
    
    def _globrec(path, *exts):
        """Recursively glob a directory and all subdirectories for multiple patterns.
    
        Note: glob matching is case-insensitive on Windows, i.e. '*.jpg' also
        matches files ending in .JPG there; on POSIX systems it is case-sensitive.
    
        Parameters
        ----------
        path : str
            A directory name
        exts : tuple
            Patterns to glob for, e.g. '*.jpg'
    
        Returns
        -------
        files : list
            Files in path and its subfolders that match any pattern in exts
        """
        dirs = [a[0] for a in os.walk(path)]
        # os.path.join supplies the separator, so patterns need no leading backslash.
        f_filter = [os.path.join(d, e) for d in dirs for e in exts]
        return [f for pattern in f_filter for f in glob.iglob(pattern)]
    
    
    my_pictures = _globrec(r'C:\Temp', '*.jpg', '*.bmp', '*.png', '*.gif')
    for f in my_pictures:
        print(f)
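If case-insensitive matching is needed regardless of platform, one option is to skip glob and compare lowercased suffixes during a single os.walk pass. This is only a sketch of an alternative, not the approach above: find_by_extension and the sample file names are hypothetical, and it takes plain extensions such as '.jpg' rather than glob patterns.

```python
import os
import tempfile


def find_by_extension(path, *exts):
    """Return files under path whose names end with any of exts, ignoring case."""
    wanted = tuple(e.lower() for e in exts)
    return [
        os.path.join(root, name)
        for root, _, files in os.walk(path)
        for name in files
        # str.endswith accepts a tuple of suffixes.
        if name.lower().endswith(wanted)
    ]


with tempfile.TemporaryDirectory() as tmp:
    for rel in ("a.jpg", "sub/B.JPG", "sub/c.png", "sub/d.txt"):
        full = os.path.join(tmp, *rel.split("/"))
        os.makedirs(os.path.dirname(full), exist_ok=True)
        open(full, "w").close()

    pictures = sorted(
        os.path.basename(p) for p in find_by_extension(tmp, ".jpg", ".png")
    )
    print(pictures)  # ['B.JPG', 'a.jpg', 'c.png']
```

The tree is walked once no matter how many extensions are passed, whereas globbing each directory per pattern scans every directory once per pattern.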
    