Downloading second file from ftp fails

Submitted by 眉间皱痕 on 2019-12-23 20:19:29

Question


I want to download multiple files from an FTP server in Python. My code works when I download just one file, but fails as soon as I download more than one:

import urllib
urllib.urlretrieve('ftp://ftp.ncbi.nlm.nih.gov/pub/pmc/oa_package/00/00/PMC1790863.tar.gz', 'file1.tar.gz')
urllib.urlretrieve('ftp://ftp.ncbi.nlm.nih.gov/pub/pmc/oa_package/00/00/PMC2329613.tar.gz', 'file2.tar.gz')

The second call fails with this error:

Traceback (most recent call last):
  File "/home/ehsan/dev_center/bigADEVS-bknd/daemons/crawler/ftp_oa_crawler.py", line 3, in <module>
    urllib.urlretrieve('ftp://ftp.ncbi.nlm.nih.gov/pub/pmc/oa_package/00/00/PMC2329613.tar.gz', 'file2.tar.gz')
  File "/usr/lib/python2.7/urllib.py", line 98, in urlretrieve
    return opener.retrieve(url, filename, reporthook, data)
  File "/usr/lib/python2.7/urllib.py", line 245, in retrieve
    fp = self.open(url, data)
  File "/usr/lib/python2.7/urllib.py", line 213, in open
    return getattr(self, name)(url)
  File "/usr/lib/python2.7/urllib.py", line 558, in open_ftp
    (fp, retrlen) = self.ftpcache[key].retrfile(file, type)
  File "/usr/lib/python2.7/urllib.py", line 906, in retrfile
    conn, retrlen = self.ftp.ntransfercmd(cmd)
  File "/usr/lib/python2.7/ftplib.py", line 334, in ntransfercmd
    host, port = self.makepasv()
  File "/usr/lib/python2.7/ftplib.py", line 312, in makepasv
    host, port = parse227(self.sendcmd('PASV'))
  File "/usr/lib/python2.7/ftplib.py", line 830, in parse227
    raise error_reply, resp
IOError: [Errno ftp error] 200 Type set to I

What should I do?


Answer 1:


This is a known, reported bug in urllib in Python 2.7. The cause has been explained as follows:

Now, when a user tries to download the same file or another file from same directory, the key (host, port, dirs) remains the same so open_ftp() skips ftp initialization. Because of this skipping, previous FTP connection is reused and when new commands are sent to the server, server first sends the previous ACK. This causes a domino effect and each response gets delayed by one and we get an exception from parse227()
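
The "delayed by one" effect can be illustrated with a toy model. The sketch below is not ftplib or a real server: ToyFtpServer and request are hypothetical names, and the reply strings are simplified. It only assumes that replies queue up in order and that the client reads exactly one reply per command, leaving the final "226" reply of the first transfer unread.

```python
from collections import deque


class ToyFtpServer(object):
    """Hypothetical stand-in for an FTP control channel: replies queue up in order."""

    def __init__(self):
        self.replies = deque()

    def send(self, command):
        if command == "TYPE I":
            self.replies.append("200 Type set to I")
        elif command == "PASV":
            self.replies.append("227 Entering Passive Mode")
        elif command.startswith("RETR"):
            self.replies.append("150 Opening data connection")
            # Sent once the transfer finishes; the client must read this one too.
            self.replies.append("226 Transfer complete")

    def read_reply(self):
        return self.replies.popleft()


def request(server, command):
    """Send one command and read exactly one reply."""
    server.send(command)
    return server.read_reply()


server = ToyFtpServer()
first = [request(server, c) for c in ("TYPE I", "PASV", "RETR file1")]
# The "226" reply from the first transfer is still unread at this point,
# so every reply on the reused connection is now shifted by one.
second = [request(server, c) for c in ("TYPE I", "PASV")]
print(second)
```

In this model the second PASV receives "200 Type set to I" instead of a 227 reply, which is the same reply text that parse227() rejects in the traceback above.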

A possible solution is to clear the cache built up by previous calls: call urllib.urlcleanup() between your urlretrieve calls.
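
A minimal sketch of that workaround, assuming the two functions are imported as shown (urllib on Python 2.7 as in the question, urllib.request on Python 3, where the same legacy functions live). The helper name download_all and its retrieve/cleanup parameters are hypothetical, added here only so the loop can be exercised without a network connection.

```python
try:
    # Python 2.7, as used in the question
    from urllib import urlretrieve, urlcleanup
except ImportError:
    # Python 3 keeps the legacy interface in urllib.request
    from urllib.request import urlretrieve, urlcleanup


def download_all(downloads, retrieve=urlretrieve, cleanup=urlcleanup):
    """Fetch each (url, filename) pair, clearing urllib's cache after each one."""
    saved = []
    for url, filename in downloads:
        try:
            retrieve(url, filename)
            saved.append(filename)
        finally:
            # Drop cached state so the next call does not reuse the old FTP connection.
            cleanup()
    return saved
```

Calling download_all with the two (URL, filename) pairs from the question should then fetch both files in sequence. Whether the underlying bug also affects Python 3 is not covered here, so treat the cleanup call there as harmless rather than required.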

Hope this helps!



Source: https://stackoverflow.com/questions/44733710/downloading-second-file-from-ftp-fails
