I\'m using this script to connect to sample ftp server and list available directories:
from ftplib import FTP
ftp = FTP(\'ftp.cwi.nl\') # connect to host, defa
=> I found this web-page while googling for a way to check if a file exists using ftplib in python. The following is what I figured out (hope it helps someone):
=> When trying to list non-existent files/directories, ftplib raises an exception. Even though Adding a try/except block is a standard practice and a good idea, I would prefer my FTP scripts to download file(s) only after making sure they exist. This helps in keeping my scripts simpler - at least when listing a directory on the FTP server is possible.
For example, the Edgar FTP server has multiple files that are stored under the directory /edgar/daily-index/. Each file is named liked "master.YYYYMMDD.idx". There is no guarantee that a file will exist for every date (YYYYMMDD) - there is no file dated 24th Nov 2013, but there is a file dated: 22th Nov 2013. How does listing work in these two cases?
# Code
from __future__ import print_function
import ftplib
ftp_client = ftplib.FTP("ftp.sec.gov", "anonymous", "MY.EMAIL@gmail.com")
resp = ftp_client.sendcmd("MLST /edgar/daily-index/master.20131122.idx")
print(resp)
resp = ftp_client.sendcmd("MLST /edgar/daily-index/master.20131124.idx")
print(resp)
# Output
250-Start of list for /edgar/daily-index/master.20131122.idx
modify=20131123030124;perm=adfr;size=301580;type=file;unique=11UAEAA398;
UNIX.group=1;UNIX.mode=0644;UNIX.owner=1019;
/edgar/daily-index/master.20131122.idx
250 End of list
Traceback (most recent call last):
File "", line 10, in <module>
resp = ftp_client.sendcmd("MLST /edgar/daily-index/master.20131124.idx")
File "lib/python2.7/ftplib.py", line 244, in sendcmd
return self.getresp()
File "lib/python2.7/ftplib.py", line 219, in getresp
raise error_perm, resp
ftplib.error_perm: 550 '/edgar/daily-index/master.20131124.idx' cannot be listed
As expected, listing a non-existent file generates an exception.
=> Since I know that the Edgar FTP server will surely have the directory /edgar/daily-index/, my script can do the following to avoid raising exceptions due to non-existent files:
a) list this directory.
b) download the required file(s) if they are are present in this listing - To check the listing I typically perform a regexp search, on the list of strings that the listing operation returns.
For example this script tries to download files for the past three days. If a file is found for a certain date then it is downloaded, else nothing happens.
import ftplib
import re
from datetime import date, timedelta
ftp_client = ftplib.FTP("ftp.sec.gov", "anonymous", "MY.EMAIL@gmail.com")
listing = []
# List the directory and store each directory entry as a string in an array
ftp_client.retrlines("LIST /edgar/daily-index", listing.append)
# go back 1,2 and 3 days
for diff in [1,2,3]:
today = (date.today() - timedelta(days=diff)).strftime("%Y%m%d")
month = (date.today() - timedelta(days=diff)).strftime("%Y_%m")
# the absolute path of the file we want to download - if it indeed exists
file_path = "/edgar/daily-index/master.%(date)s.idx" % { "date": today }
# create a regex to match the file's name
pattern = re.compile("master.%(date)s.idx" % { "date": today })
# filter out elements from the listing that match the pattern
found = filter(lambda x: re.search(pattern, x) != None, listing)
if( len(found) > 0 ):
ftp_client.retrbinary(
"RETR %(file_path)s" % { "file_path": file_path },
open(
'./edgar/daily-index/%(month)s/master.%(date)s.idx' % {
"date": today
}, 'wb'
).write
)
=> Interestingly, there are situations where we cannot list a directory on the FTP server. The edgar FTP server, for example, disallows listing on /edgar/data because it contains far too many sub-directories. In such cases, I wouldn't be able to use the "List and check for existence" approach described here - in these cases I would have to use exception handling in my downloader script to recover from non-existent file/directory access attempts.
You can send "MLST path" over the control connection. That will return a line including the type of the path (notice 'type=dir' down here):
250-Listing "/home/user":
modify=20131113091701;perm=el;size=4096;type=dir;unique=813gc0004; /
250 End MLST.
Translated into python that should be something along these lines:
import ftplib
ftp = ftplib.FTP()
ftp.connect('ftp.somedomain.com', 21)
ftp.login()
resp = ftp.sendcmd('MLST pathname')
if 'type=dir;' in resp:
# it should be a directory
pass
Of course the code above is not 100% reliable and would need a 'real' parser. You can look at the implementation of MLSD command in ftplib.py which is very similar (MLSD differs from MLST in that the response in sent over the data connection but the format of the lines being transmitted is the same): http://hg.python.org/cpython/file/8af2dc11464f/Lib/ftplib.py#l577
The examples attached to ghostdog74's answer have a bit of a bug: the list you get back is the whole line of the response, so you get something like
drwxrwxrwx 4 5063 5063 4096 Sep 13 20:00 resized
This means if your directory name is something like '50' (which is was in my case), you'll get a false positive. I modified the code to handle this:
def directory_exists_here(self, directory_name):
filelist = []
self.ftp.retrlines('LIST',filelist.append)
for f in filelist:
if f.split()[-1] == directory_name:
return True
return False
N.B., this is inside an FTP wrapper class I wrote and self.ftp is the actual FTP connection.
In 3.x nlst()
method is deprecated. Use this code:
import ftplib
remote = ftplib.FTP('example.com')
remote.login()
if 'foo' in [name for name, data in list(remote.mlsd())]:
# do your stuff
The list()
call is needed because mlsd()
returns a generator and they do not support checking what is in them (do not have __contains__()
method).
You can wrap [name for name, data in list(remote.mlsd())]
list comp in a function of method and call it when you will need to just check if a directory (or file) exists.
you can use a list. example
import ftplib
server="localhost"
user="user"
password="test@email.com"
try:
ftp = ftplib.FTP(server)
ftp.login(user,password)
except Exception,e:
print e
else:
filelist = [] #to store all files
ftp.retrlines('LIST',filelist.append) # append to list
f=0
for f in filelist:
if "public_html" in f:
#do something
f=1
if f==0:
print "No public_html"
#do your processing here
Tom is correct, but no one voted him up however for the satisfaction who voted up ghostdog74 I will mix and write this code, works for me, should work for you guys.
import ftplib
server="localhost"
user="user"
uploadToDir="public_html"
password="test@email.com"
try:
ftp = ftplib.FTP(server)
ftp.login(user,password)
except Exception,e:
print e
else:
filelist = [] #to store all files
ftp.retrlines('NLST',filelist.append) # append to list
num=0
for f in filelist:
if f.split()[-1] == uploadToDir:
#do something
num=1
if num==0:
print "No public_html"
#do your processing here
first of all if you follow ghost dog method, even if you say directory "public" in f, even when it doesnt exist it will evaluate to true because the word public exist in "public_html" so thats where Tom if condition can be used so I changed it to if f.split()[-1] == uploadToDir:.
Also if you enter a directory name somethig that doesnt exist but some files and folder exist the second by ghostdog74 will never execute because its never 0 as overridden by f in for loop so I used num variable instead of f and voila the goodness follows...
Vinay and Jonathon are right about what they commented.