Python and urllib

感动是毒 2020-12-19 07:12

I'm trying to download a zip file ("tl_2008_01001_edges.zip") from an FTP census site using urllib. What form is the zip file in when I get it, and how do I save it?

3 Answers
  • 2020-12-19 07:53

    Per the docs, urlretrieve saves the file to disk and returns a tuple (filename, headers). So the file is already saved when urlretrieve returns.

    You can open and read the ZIP file you've retrieved with the zipfile module of the standard library. Note that glob does not work inside ZIP files, only on normal filesystem directories.
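
As a rough illustration of the two points above, here is a minimal sketch written for Python 3, where urlretrieve lives in urllib.request (that is an assumption; the question does not say which Python version is in use). The function name download_and_list and its pattern parameter are hypothetical. Because glob cannot look inside an archive, the sketch filters the member names with fnmatch instead.

```python
import fnmatch
import urllib.request
import zipfile


def download_and_list(url, dest, pattern="*"):
    """Save `url` to `dest` and return the ZIP member names matching `pattern`."""
    # urlretrieve writes the file to disk and returns (filename, headers).
    filename, headers = urllib.request.urlretrieve(url, dest)
    # zipfile reads the archive's table of contents from the saved file.
    with zipfile.ZipFile(filename) as archive:
        names = archive.namelist()
    # glob only works on real directories, so match the names ourselves.
    return fnmatch.filter(names, pattern)
```

Calling it with the FTP URL of the zip file and a local path such as /tmp/test.zip should leave the archive on disk and return its member names.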

  • 2020-12-19 07:57

    Use urllib2.urlopen() for the zip file data and directory listing.

    To process zip files with the zipfile module, you can write them to a disk file which is then passed to the zipfile.ZipFile constructor. Retrieving the data is straightforward using read() on the file-like object returned by urllib2.urlopen().
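
The write-to-disk step described above might look like the following sketch. It assumes Python 3, where urllib2.urlopen became urllib.request.urlopen; the helper name save_and_open is hypothetical.

```python
import urllib.request
import zipfile


def save_and_open(url, path):
    """Write the bytes at `url` to `path`, then open the result as a ZIP archive."""
    # urlopen returns a file-like object; read() yields the raw bytes.
    with urllib.request.urlopen(url) as response:
        data = response.read()
    with open(path, "wb") as f:
        f.write(data)
    # The caller is responsible for closing the returned ZipFile.
    return zipfile.ZipFile(path)
```

The returned object can be used in a with statement, for example to call namelist() or read() on individual members.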

    Fetching directories:

    >>> files = urllib2.urlopen('ftp://ftp2.census.gov/geo/tiger/TIGER2008/01_ALABAMA/').read().splitlines()
    >>> for l in files[:4]: print l
    ... 
    drwxrwsr-x    2 0        4009         4096 Nov 26  2008 01001_Autauga_County
    drwxrwsr-x    2 0        4009         4096 Nov 26  2008 01003_Baldwin_County
    drwxrwsr-x    2 0        4009         4096 Nov 26  2008 01005_Barbour_County
    drwxrwsr-x    2 0        4009         4096 Nov 26  2008 01007_Bibb_County
    >>> 
    

    Or, splitting for directory names:

    >>> for l in files[:4]: print l.split()[-1]
    ... 
    01001_Autauga_County
    01003_Baldwin_County
    01005_Barbour_County
    01007_Bibb_County
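
The same split can be wrapped in a small function. This is a sketch for Python 3, where read() returns bytes, so the listing is decoded first; the name entry_names is hypothetical. Like the one-liner above, it assumes that entry names contain no spaces.

```python
def entry_names(listing):
    """Return the last whitespace-separated field of each non-empty listing line."""
    # In Python 3, urlopen(...).read() returns bytes, so decode before splitting.
    if isinstance(listing, bytes):
        listing = listing.decode("utf-8", errors="replace")
    return [line.split()[-1] for line in listing.splitlines() if line.strip()]
```

Passing it the raw result of read() on the directory URL should give the list of county directory names.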
    
  • 2020-12-19 08:04
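    import os
    import urllib2

    out = os.path.join("/tmp", "test.zip")
    url = "ftp://ftp2.census.gov/geo/tiger/TIGER2008/01_ALABAMA/01001_Autauga_County/tl_2008_01001_edges.zip"
    page = urllib2.urlopen(url)
    # Write in binary mode; the with block closes the file even on error.
    with open(out, "wb") as f:
        f.write(page.read())
    page.close()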
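
For larger archives, a variant that avoids holding the whole file in memory may be useful. The sketch below is not the answer's code: it assumes Python 3 (urllib.request in place of urllib2) and swaps the single read() for a chunked copy with shutil.copyfileobj. The function name download and the chunk_size default are hypothetical choices.

```python
import os
import shutil
import urllib.request


def download(url, out_path, chunk_size=64 * 1024):
    """Stream `url` to `out_path` in chunks and return the size of the saved file."""
    # Both the response and the output file are closed when the block exits.
    with urllib.request.urlopen(url) as response, open(out_path, "wb") as out:
        # copyfileobj reads and writes chunk_size bytes at a time.
        shutil.copyfileobj(response, out, chunk_size)
    return os.path.getsize(out_path)
```

It would be called with the same url and out values as in the snippet above.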
    