How to use glob() to find files recursively?

天涯浪人 2020-11-21 22:54

This is what I have:

glob(os.path.join('src','*.c'))

but I want to search the subfolders of src. Something like this would work:

28 answers
  •  一向 (original poster)
    2020-11-21 23:29

    In case this interests anyone, I've profiled the top three proposed methods. The globbed folder contains roughly 500K files in total, of which 2K match the desired pattern.

    Here's the (very basic) code:

    import fnmatch
    import glob
    import os
    from pathlib import Path
    from time import time


    def find_files_iglob():
        # "**" with recursive=True matches any number of nested directories
        return glob.iglob("./data/**/data.json", recursive=True)


    def find_files_oswalk():
        # Walk the tree manually and filter the file names of each directory
        for root, dirnames, filenames in os.walk('data'):
            for filename in fnmatch.filter(filenames, 'data.json'):
                yield os.path.join(root, filename)


    def find_files_rglob():
        # pathlib counterpart of a recursive glob
        return Path('data').rglob('data.json')


    # Exhaust each iterator once and record the wall-clock time
    t0 = time()
    for f in find_files_oswalk():
        pass
    t1 = time()
    for f in find_files_rglob():
        pass
    t2 = time()
    for f in find_files_iglob():
        pass
    t3 = time()
    # Elapsed seconds, in order: os.walk, rglob, iglob
    print(t1 - t0, t2 - t1, t3 - t2)

    And the results I got were:

    os.walk: ~3.6 sec
    rglob: ~14.5 sec
    iglob: ~16.9 sec

    Platform: Ubuntu 16.04, x86_64 (Core i7).
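For the original question (all `*.c` files under `src`, at any depth), here is a minimal sketch of the same three approaches, run against a small throwaway tree to check that they find the same files. The helper names (`find_glob`, `find_oswalk`, `find_rglob`, `build_demo_tree`) and the directory layout are made up for this example and are not part of any library.

```python
import fnmatch
import glob
import os
import tempfile
from pathlib import Path


def find_glob(root, pattern):
    # With recursive=True, "**" matches zero or more nested directories.
    # glob.escape() protects any special characters in the root path itself.
    full_pattern = os.path.join(glob.escape(root), "**", pattern)
    return sorted(glob.glob(full_pattern, recursive=True))


def find_oswalk(root, pattern):
    # Visit every directory and filter its file names with fnmatch.
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in fnmatch.filter(filenames, pattern):
            matches.append(os.path.join(dirpath, name))
    return sorted(matches)


def find_rglob(root, pattern):
    # Path.rglob() yields Path objects; convert to str for comparison.
    return sorted(str(p) for p in Path(root).rglob(pattern))


def build_demo_tree(base):
    # Hypothetical layout, invented for this example only.
    src = os.path.join(base, "src")
    for rel in ("main.c", "util/str.c", "util/io/file.c", "util/io/notes.txt"):
        path = os.path.join(src, *rel.split("/"))
        os.makedirs(os.path.dirname(path), exist_ok=True)
        open(path, "w").close()
    return src


with tempfile.TemporaryDirectory() as base:
    src = build_demo_tree(base)
    results = [find(src, "*.c") for find in (find_glob, find_oswalk, find_rglob)]
    # Strip the temporary prefix so the lists are easy to read and compare.
    relative = [[os.path.relpath(p, src) for p in r] for r in results]

print(relative[0])
print("all three agree:", relative[0] == relative[1] == relative[2])
```

One difference to keep in mind: by default the `glob` module skips names that start with a dot, while `fnmatch.filter` and `Path.rglob` do not, so the three can disagree on trees that contain hidden files.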
