Question
I'm trying to download a large number of files that all share a common string (DEM) from an FTP server. These files are nested inside multiple directories, for example Adair/DEM* and Adams/DEM*.
The FTP server is located here: ftp://ftp.igsb.uiowa.edu/gis_library/counties/ and requires no username or password.
So, I'd like to go through each county and download the files containing the string DEM.
I've read many questions here on Stack Overflow and the Python documentation, but I cannot figure out how to use ftplib.FTP() to get into the site without a username and password (none is required), and I can't figure out how to grep or use glob.glob inside of ftplib or urllib.
Thanks in advance for your help.
Answer 1:
OK, this seems to work. There may be issues if it tries to download a directory or list a file; exception handling may come in handy to trap wrong file types and skip them.
glob.glob cannot work since you're on a remote filesystem, but you can use fnmatch to match the names.
Here's the code: it downloads all files matching *DEM* into the TEMP directory, sorted by directory.
import fnmatch
import ftplib
import os
import sys

output_root = os.getenv("TEMP")  # usually set on Windows; pick another directory elsewhere

fc = ftplib.FTP("ftp.igsb.uiowa.edu")
fc.login()  # no arguments: logs in as the anonymous user
fc.cwd("/gis_library/counties")

root_dirs = fc.nlst()
for l in root_dirs:
    sys.stderr.write(l + " ...\n")
    #print(fc.size(l))
    dir_files = fc.nlst(l)
    local_dir = os.path.join(output_root, l)
    if not os.path.exists(local_dir):
        os.mkdir(local_dir)
    for f in dir_files:
        if fnmatch.fnmatch(f, "*DEM*"):  # cannot use glob.glob
            sys.stderr.write("downloading " + l + "/" + f + " ...\n")
            local_filename = os.path.join(local_dir, f)
            with open(local_filename, 'wb') as fh:
                fc.retrbinary('RETR ' + l + "/" + f, fh.write)
fc.close()
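The name-matching step can be tried offline before connecting to the server. This is a minimal sketch, not part of the original answer: the helper name select_matching and the sample listing are hypothetical, and the file names are made up for illustration. It matches on the last path component because, depending on the server, NLST may return either bare names or "dir/name" paths.

```python
import fnmatch
import posixpath


def select_matching(names, pattern="*DEM*"):
    """Return the entries whose final path component matches the pattern."""
    # Match on the base name only, so both 'name' and 'dir/name' entries work.
    # fnmatchcase keeps the comparison case-sensitive on every platform.
    return [
        n for n in names
        if fnmatch.fnmatchcase(posixpath.basename(n), pattern)
    ]


# Hypothetical listing, for illustration only (not taken from the server).
listing = ["Adair/DEM_3m.zip", "Adair/roads.zip", "DEM_10m.zip", "dem_notes.txt"]
print(select_matching(listing))  # ['Adair/DEM_3m.zip', 'DEM_10m.zip']
```

Note that fnmatch.fnmatch, as used in the answer's code, normalizes case on Windows, so there it would also match dem_notes.txt; fnmatch.fnmatchcase avoids that difference.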
Answer 2:
The answer by @Jean, with local pattern matching, is the correct portable solution that adheres to FTP standards.
However, since most FTP servers support non-standard wildcards in file listing commands, you can almost always use a simpler and, more importantly, more efficient solution like:
files = ftp.nlst("*DEM*")
for f in files:
    with open(f, 'wb') as fh:
        ftp.retrbinary('RETR ' + f, fh.write)
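The two approaches can be combined: try the server-side wildcard first and fall back to local matching if the server refuses it. The following is only a sketch under assumptions. The helper name list_matching is hypothetical, ftp is assumed to be an already connected ftplib.FTP object in the target directory, and it assumes a refusal surfaces as ftplib.error_perm (a 5xx reply), which may not hold for every server.

```python
import fnmatch
import ftplib


def list_matching(ftp, pattern="*DEM*"):
    """List names matching pattern, preferring a server-side wildcard."""
    try:
        # Non-standard but widely supported: the server expands the wildcard.
        return ftp.nlst(pattern)
    except ftplib.error_perm:
        # The server rejected the wildcard (or answered "no match" with a
        # 5xx reply): list everything and filter on the client instead.
        return fnmatch.filter(ftp.nlst(), pattern)
```

With such a helper, the loop above would iterate over list_matching(ftp) instead of ftp.nlst("*DEM*"). The fallback costs one extra full listing, but only on servers that do not accept wildcards.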
Source: https://stackoverflow.com/questions/38943398/download-files-from-an-ftp-server-containing-given-string-using-python