I did a test (Python 3.6.4, W7x64) to see which solution is the fastest for one folder, no subdirectories, to get a list of complete file paths for files with a specific extension.
To make it short, for this task os.listdir()
is the fastest and is 1.7x as fast as the next best: os.walk()
(with a break!), 2.7x as fast as pathlib
, 3.2x faster than os.scandir()
and 3.3x faster than glob
.
Please keep in mind, that those results will change when you need recursive results. If you copy/paste one method below, please add a .lower() otherwise .EXT would not be found when searching for .ext.
import os
import pathlib
import timeit
import glob
def a():
path = pathlib.Path().cwd()
list_sqlite_files = [str(f) for f in path.glob("*.sqlite")]
def b():
path = os.getcwd()
list_sqlite_files = [f.path for f in os.scandir(path) if os.path.splitext(f)[1] == ".sqlite"]
def c():
path = os.getcwd()
list_sqlite_files = [os.path.join(path, f) for f in os.listdir(path) if f.endswith(".sqlite")]
def d():
path = os.getcwd()
os.chdir(path)
list_sqlite_files = [os.path.join(path, f) for f in glob.glob("*.sqlite")]
def e():
path = os.getcwd()
list_sqlite_files = [os.path.join(path, f) for f in glob.glob1(str(path), "*.sqlite")]
def f():
path = os.getcwd()
list_sqlite_files = []
for root, dirs, files in os.walk(path):
for file in files:
if file.endswith(".sqlite"):
list_sqlite_files.append( os.path.join(root, file) )
break
print(timeit.timeit(a, number=1000))
print(timeit.timeit(b, number=1000))
print(timeit.timeit(c, number=1000))
print(timeit.timeit(d, number=1000))
print(timeit.timeit(e, number=1000))
print(timeit.timeit(f, number=1000))
Results:
# Python 3.6.4
0.431
0.515
0.161
0.548
0.537
0.274