I'm trying to download a large number of files that all share a common string (DEM) from an FTP server. These files are nested inside multiple directories.
glob.glob cannot work here since you're on a remote filesystem, but you can use fnmatch to match the names.

Here's the code: it downloads all files matching *DEM* into the TEMP directory, sorted by directory. Note that there may be issues if you try to download a directory or scan a file; exception handling can come in handy to trap wrong file types and skip them.
import ftplib, sys, fnmatch, os

output_root = os.getenv("TEMP")

fc = ftplib.FTP("ftp.igsb.uiowa.edu")
fc.login()
fc.cwd("/gis_library/counties")
root_dirs = fc.nlst()
for l in root_dirs:
    sys.stderr.write(l + " ...\n")
    dir_files = fc.nlst(l)
    local_dir = os.path.join(output_root, l)
    if not os.path.exists(local_dir):
        os.mkdir(local_dir)
    for f in dir_files:
        if fnmatch.fnmatch(f, "*DEM*"):  # cannot use glob.glob on a remote listing
            sys.stderr.write("downloading " + l + "/" + f + " ...\n")
            local_filename = os.path.join(local_dir, f)
            with open(local_filename, 'wb') as fh:
                fc.retrbinary('RETR ' + l + "/" + f, fh.write)
fc.close()
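As a quick illustration of why fnmatch works here where glob does not: fnmatch tests a bare name string against a shell-style wildcard pattern without ever touching the filesystem, so it can be applied to names returned by a remote listing. (The sample names below are made up for demonstration.)

```python
import fnmatch

# Hypothetical directory listing, as FTP.nlst() might return it
names = ["07045DEM.zip", "readme.txt", "index.html", "02031DEM.zip"]

# fnmatch.fnmatchcase matches a plain string against a wildcard pattern;
# unlike fnmatch.fnmatch, it never folds case, so results are OS-independent
matches = [n for n in names if fnmatch.fnmatchcase(n, "*DEM*")]
print(matches)  # ['07045DEM.zip', '02031DEM.zip']
```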
The answer by @Jean, with the local pattern matching, is the correct portable solution that adheres to the FTP standards.

However, as most FTP servers do support a non-standard use of wildcards with file-listing commands, you can almost always use a simpler and more efficient solution like:
files = ftp.nlst("*DEM*")

for f in files:
    with open(f, 'wb') as fh:
        ftp.retrbinary('RETR ' + f, fh.write)
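If you want the convenience of the wildcard listing with a portable fallback, one possible pattern is a small helper that tries the server-side wildcard first and falls back to client-side filtering. This is a sketch: list_matching is a hypothetical helper, and it assumes a server that rejects an unsupported wildcard raises a permanent error, which is not guaranteed (some servers instead return an empty or literal listing).

```python
import fnmatch
import ftplib


def list_matching(ftp, pattern):
    """Try a server-side wildcard listing; fall back to filtering locally."""
    try:
        names = ftp.nlst(pattern)
        if names:
            return names
    except ftplib.error_perm:
        pass  # server rejected the wildcard argument
    # Portable fallback: list everything, filter client-side with fnmatch
    return fnmatch.filter(ftp.nlst(), pattern)
```

With a server that rejects wildcards, the helper transparently falls back to listing the whole directory and filtering the names locally, so the caller sees the same result either way.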