How to make Python check if ftp directory exists?

一生所求 2021-02-12 20:17

I'm using this script to connect to a sample FTP server and list the available directories:

from ftplib import FTP
ftp = FTP('ftp.cwi.nl')   # connect to host, defa

8 answers
  • 2021-02-12 20:39

    => I found this page while googling for a way to check whether a file exists using ftplib in Python. The following is what I figured out (hope it helps someone):

    => When trying to list non-existent files/directories, ftplib raises an exception. Even though adding a try/except block is standard practice and a good idea, I would prefer my FTP scripts to download file(s) only after making sure they exist. This helps keep my scripts simpler, at least when listing a directory on the FTP server is possible.

    For example, the Edgar FTP server has multiple files stored under the directory /edgar/daily-index/. Each file is named like "master.YYYYMMDD.idx". There is no guarantee that a file will exist for every date (YYYYMMDD): there is no file dated 24th Nov 2013, but there is a file dated 22nd Nov 2013. How does listing work in these two cases?

    # Code
    from __future__ import print_function  
    import ftplib  
    
    ftp_client = ftplib.FTP("ftp.sec.gov", "anonymous", "MY.EMAIL@gmail.com")  
    resp = ftp_client.sendcmd("MLST /edgar/daily-index/master.20131122.idx")  
    print(resp)   
    resp = ftp_client.sendcmd("MLST /edgar/daily-index/master.20131124.idx")  
    print(resp)  
    
    # Output
    250-Start of list for /edgar/daily-index/master.20131122.idx  
    modify=20131123030124;perm=adfr;size=301580;type=file;unique=11UAEAA398;  
    UNIX.group=1;UNIX.mode=0644;UNIX.owner=1019;  
    /edgar/daily-index/master.20131122.idx
    250 End of list  
    
    Traceback (most recent call last):
      File "", line 10, in <module>
        resp = ftp_client.sendcmd("MLST /edgar/daily-index/master.20131124.idx")
      File "lib/python2.7/ftplib.py", line 244, in sendcmd
        return self.getresp()
      File "lib/python2.7/ftplib.py", line 219, in getresp
        raise error_perm, resp
    ftplib.error_perm: 550 '/edgar/daily-index/master.20131124.idx' cannot be listed
    

    As expected, listing a non-existent file generates an exception.
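The behaviour shown above can be wrapped in a small helper. This is only a sketch: `remote_path_exists` is a hypothetical name, and it assumes the server supports MLST and answers with a 5xx reply for a missing path, as in the traceback above. A 5xx reply can also mean "permission denied" or "command not supported", so False here really means "could not be listed".

```python
import ftplib


def remote_path_exists(ftp, path):
    """Return True if the server answers MLST for path, False on a 5xx reply."""
    try:
        ftp.sendcmd("MLST " + path)
    except ftplib.error_perm:
        # Permanent error, e.g. "550 ... cannot be listed".
        return False
    return True
```

With a logged-in client it would be called as `remote_path_exists(ftp_client, "/edgar/daily-index/master.20131124.idx")`.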

    => Since I know that the Edgar FTP server will surely have the directory /edgar/daily-index/, my script can do the following to avoid raising exceptions due to non-existent files:
    a) list this directory.
    b) download the required file(s) if they are present in this listing. To check the listing I typically perform a regexp search on the list of strings that the listing operation returns.

    For example this script tries to download files for the past three days. If a file is found for a certain date then it is downloaded, else nothing happens.

    import ftplib
    import os
    import re
    from datetime import date, timedelta
    
    ftp_client = ftplib.FTP("ftp.sec.gov", "anonymous", "MY.EMAIL@gmail.com")
    listing = []
    # List the directory and store each directory entry as a string in a list
    ftp_client.retrlines("LIST /edgar/daily-index", listing.append)
    # go back 1, 2 and 3 days
    for diff in [1, 2, 3]:
        day = date.today() - timedelta(days=diff)
        stamp = day.strftime("%Y%m%d")
        month = day.strftime("%Y_%m")
        file_name = "master.%s.idx" % stamp
        # the absolute path of the file we want to download - if it indeed exists
        file_path = "/edgar/daily-index/" + file_name
        # create a regex to match the file's name (dots escaped so they match literally)
        pattern = re.compile(re.escape(file_name))
        # keep only the listing lines that match the pattern
        found = [line for line in listing if pattern.search(line)]
        if found:
            local_dir = "./edgar/daily-index/%s" % month
            # create the local target directory if it is missing
            if not os.path.isdir(local_dir):
                os.makedirs(local_dir)
            # the with block closes the local file once the transfer ends
            with open(os.path.join(local_dir, file_name), "wb") as out:
                ftp_client.retrbinary("RETR " + file_path, out.write)
    
    

    => Interestingly, there are situations where we cannot list a directory on the FTP server. The Edgar FTP server, for example, disallows listing /edgar/data because it contains far too many sub-directories. In such cases I cannot use the "list and check for existence" approach described here; I have to use exception handling in my downloader script to recover from attempts to access non-existent files or directories.
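For the case where listing is not allowed, the exception-handling fallback could look roughly like this. It is a sketch under assumptions: `download_if_present` is a hypothetical helper, and the server is assumed to answer RETR for a missing file with a 5xx reply, which ftplib raises as `ftplib.error_perm`.

```python
import ftplib
import os


def download_if_present(ftp, remote_path, local_path):
    """Try to fetch remote_path; return True on success, False on a 5xx reply."""
    try:
        with open(local_path, "wb") as out:
            ftp.retrbinary("RETR " + remote_path, out.write)
    except ftplib.error_perm:
        # The server refused the request (for example 550: no such file).
        # Remove the local file that open() created before the refusal.
        os.remove(local_path)
        return False
    return True
```

Other errors (timeouts, temporary 4xx replies) are deliberately not caught, so they still surface to the caller.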

  • 2021-02-12 20:43

    You can send "MLST path" over the control connection. That will return a line including the type of the path (notice 'type=dir' below):

    250-Listing "/home/user":
     modify=20131113091701;perm=el;size=4096;type=dir;unique=813gc0004; /
    250 End MLST.
    

    Translated into Python, that should be something along these lines:

    import ftplib
    ftp = ftplib.FTP()
    ftp.connect('ftp.somedomain.com', 21)
    ftp.login()
    resp = ftp.sendcmd('MLST pathname')
    if 'type=dir;' in resp:
        # it should be a directory
        pass
    

    Of course the code above is not 100% reliable and would need a 'real' parser. You can look at the implementation of the MLSD command in ftplib.py, which is very similar (MLSD differs from MLST in that the response is sent over the data connection, but the format of the lines being transmitted is the same): http://hg.python.org/cpython/file/8af2dc11464f/Lib/ftplib.py#l577
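A minimal sketch of such a parser, for illustration only. The names `parse_mlst_response` and `is_directory` are hypothetical. It assumes the reply string has the shape shown above: status lines starting with the reply code, and one entry line that starts with a space and holds `fact=value;` pairs followed by a space and the path name.

```python
def parse_mlst_response(resp):
    """Return (path, facts) parsed from the text of an MLST reply."""
    for line in resp.splitlines():
        if not line.startswith(" "):
            continue  # status line such as '250-Listing ...' or '250 End MLST.'
        # Drop the leading space, then split the facts from the path name.
        facts_part, _, path = line[1:].partition(" ")
        facts = {}
        for fact in facts_part.rstrip(";").split(";"):
            key, sep, value = fact.partition("=")
            if sep:
                facts[key.lower()] = value
        return path, facts
    raise ValueError("no MLST entry line found in response")


def is_directory(resp):
    """Return True if the 'type' fact of the reply names a directory."""
    _, facts = parse_mlst_response(resp)
    return facts.get("type", "").lower() in ("dir", "cdir", "pdir")
```

It would be used as `is_directory(ftp.sendcmd('MLST pathname'))`, which avoids matching 'type=dir;' inside a path name by accident.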

  • 2021-02-12 20:44

    The examples attached to ghostdog74's answer have a bit of a bug: the list you get back is the whole line of the response, so you get something like

    drwxrwxrwx    4 5063     5063         4096 Sep 13 20:00 resized
    

    This means that if your directory name is something like '50' (which it was in my case), you'll get a false positive. I modified the code to handle this:

    def directory_exists_here(self, directory_name):
        filelist = []
        self.ftp.retrlines('LIST',filelist.append)
        for f in filelist:
            if f.split()[-1] == directory_name:
                return True
        return False
    

    N.B., this is inside an FTP wrapper class I wrote and self.ftp is the actual FTP connection.
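A slightly stricter variant is sketched below as a plain function over the listing lines. It assumes Unix-style LIST output with nine whitespace-separated columns like the sample line above; LIST output is not standardized, so this is an assumption about the server. `listing_has_directory` is a hypothetical name. It also ignores plain files and keeps names containing spaces in one piece; symlinks are not handled.

```python
def listing_has_directory(lines, directory_name):
    """Return True if a Unix-style LIST line describes a directory with this exact name."""
    for line in lines:
        # Split into at most 9 fields so a name containing spaces stays whole.
        fields = line.split(None, 8)
        if len(fields) < 9:
            continue  # line is not in the assumed "ls -l" layout
        is_dir = fields[0].startswith("d")
        if is_dir and fields[8] == directory_name:
            return True
    return False
```

The lines would come from `retrlines('LIST', filelist.append)` exactly as in the method above.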

  • 2021-02-12 20:44

    In Python 3.x the nlst() method is deprecated. Use this code instead:

    import ftplib
    
    remote = ftplib.FTP('example.com')
    remote.login()
    
    if 'foo' in [name for name, data in remote.mlsd()]:
        # do your stuff
        pass
    

    mlsd() returns a generator of (name, facts) tuples. The list comprehension collects the names so that membership can be tested with in; no extra list() call around mlsd() is needed, because a comprehension can iterate over a generator directly.

    You can wrap the [name for name, data in remote.mlsd()] list comprehension in a function or method and call it whenever you just need to check whether a directory (or file) exists.
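One possible shape for such a wrapper, as a sketch. `remote_entry_exists` and its `want_dir` parameter are hypothetical names, and the server is assumed to support MLSD and to report a `type` fact.

```python
def remote_entry_exists(ftp, name, path="", want_dir=False):
    """Return True if path contains an entry called name (optionally a directory)."""
    for entry_name, facts in ftp.mlsd(path):
        if entry_name == name:
            # With want_dir set, also require the entry to be a directory.
            return not want_dir or facts.get("type", "").lower() == "dir"
    return False
```

For the example above this would be called as `remote_entry_exists(remote, 'foo', want_dir=True)`.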

  • 2021-02-12 20:46

    You can use a list. Example:

    import ftplib
    server="localhost"
    user="user"
    password="test@email.com"
    try:
        ftp = ftplib.FTP(server)
        ftp.login(user,password)
    except Exception as e:
        print(e)
    else:
        filelist = [] #to store all files
        ftp.retrlines('LIST',filelist.append)    # append to list
        f=0
        for f in filelist:
            if "public_html" in f:
                #do something
                f=1
        if f==0:
            print("No public_html")
            #do your processing here
    
  • 2021-02-12 20:47

    Tom is correct, but no one voted him up. For the benefit of those who voted up ghostdog74, I combined both answers into the code below; it works for me and should work for you too.

    import ftplib
    server="localhost"
    user="user"
    uploadToDir="public_html"
    password="test@email.com"
    try:
        ftp = ftplib.FTP(server)
        ftp.login(user,password)
    except Exception as e:
        print(e)
    else:
        filelist = [] #to store all files
        ftp.retrlines('NLST',filelist.append)    # append to list
        num=0
        for f in filelist:
            if f.split()[-1] == uploadToDir:
                #do something
                num=1
        if num==0:
            print("No public_html")
            #do your processing here
    

    First of all, if you follow ghostdog74's method and test "public" in f, the test evaluates to true even when that directory does not exist, because the word "public" occurs in "public_html". That is where Tom's if condition helps, so I changed it to if f.split()[-1] == uploadToDir:.

    Also, if you enter a directory name that does not exist while other files and folders do exist, the second check in ghostdog74's code (if f==0) never runs, because f is never 0: it is overwritten by the loop variable in the for loop. So I used a separate variable, num, instead of f, and voilà, it works.
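The two points above can be seen in isolation, without an FTP server. This is just an illustration on a made-up list of names; `has_exact_name` is a hypothetical helper.

```python
def has_exact_name(names, wanted):
    """Return True only if wanted equals one of the names, not merely part of one."""
    found = False  # separate flag, never reused as the loop variable
    for name in names:
        if name == wanted:
            found = True
            break
    return found


names = ["public_html", "logs"]
print(any("public" in name for name in names))  # substring test: True
print(has_exact_name(names, "public"))          # exact comparison: False
```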

    Vinay and Jonathon are right about what they commented.
