Python and urllib

Posted by 谁都会走 on 2019-12-29 07:39:11

Question


I'm trying to download a zip file ("tl_2008_01001_edges.zip") from an ftp census site using urllib. What form is the zip file in when I get it and how do I save it?

I'm fairly new to Python and don't understand how urllib works.

This is my attempt:

import urllib, sys

zip_file = urllib.urlretrieve("ftp://ftp2.census.gov/geo/tiger/TIGER2008/01_ALABAMA/Autauga_County/", "tl_2008_01001_edges.zip")

If I know the list of ftp folders (or counties in this case), can I run through the ftp site list using the glob function?

Thanks.


Answer 1:


Use urllib2.urlopen() for the zip file data and directory listing.

To process zip files with the zipfile module, you can write them to a disk file which is then passed to the zipfile.ZipFile constructor. Retrieving the data is straightforward using read() on the file-like object returned by urllib2.urlopen().
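A minimal sketch of that write-then-open flow, written for Python 3, where `urllib2.urlopen` is available as `urllib.request.urlopen`. The helper name `save_and_list` and its parameters are made up for illustration; `response` can be any file-like object with a `read()` method, such as the one `urlopen` returns.

```python
import zipfile


def save_and_list(response, dest_path):
    """Write a file-like object's bytes to dest_path, then list the archive."""
    # read() returns the raw bytes of the zip file.
    data = response.read()
    with open(dest_path, "wb") as f:
        f.write(data)
    # ZipFile accepts the path of a file on disk.
    with zipfile.ZipFile(dest_path) as zf:
        return zf.namelist()
```

With a network connection this could be called as `save_and_list(urllib.request.urlopen(url), "edges.zip")`, where `url` points at the zip file itself rather than at its directory.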

Fetching directories:

>>> files = urllib2.urlopen('ftp://ftp2.census.gov/geo/tiger/TIGER2008/01_ALABAMA/').read().splitlines()
>>> for l in files[:4]: print l
... 
drwxrwsr-x    2 0        4009         4096 Nov 26  2008 01001_Autauga_County
drwxrwsr-x    2 0        4009         4096 Nov 26  2008 01003_Baldwin_County
drwxrwsr-x    2 0        4009         4096 Nov 26  2008 01005_Barbour_County
drwxrwsr-x    2 0        4009         4096 Nov 26  2008 01007_Bibb_County
>>> 

Or, splitting for directory names:

>>> for l in files[:4]: print l.split()[-1]
... 
01001_Autauga_County
01003_Baldwin_County
01005_Barbour_County
01007_Bibb_County
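To walk every county, the directory names can be turned into download URLs. A rough sketch, assuming each folder holds a file named `tl_2008_<code>_edges.zip`, where `<code>` is the numeric prefix of the folder name. That pattern is inferred from the single file named in the question and has not been checked against the server. `county_zip_urls` is a hypothetical helper.

```python
BASE = "ftp://ftp2.census.gov/geo/tiger/TIGER2008/01_ALABAMA/"


def county_zip_urls(listing_lines, base=BASE):
    """Build one edges-zip URL per directory line of an FTP listing."""
    urls = []
    for line in listing_lines:
        fields = line.split()
        # Directory entries start with 'd' in the permissions column.
        if not fields or not line.startswith("d"):
            continue
        folder = fields[-1]          # e.g. 01001_Autauga_County
        code = folder.split("_")[0]  # e.g. 01001
        # Assumed naming pattern, taken from the question's example file.
        urls.append("%s%s/tl_2008_%s_edges.zip" % (base, folder, code))
    return urls


# Two listing lines copied from the output above.
sample = [
    "drwxrwsr-x    2 0        4009         4096 Nov 26  2008 01001_Autauga_County",
    "drwxrwsr-x    2 0        4009         4096 Nov 26  2008 01003_Baldwin_County",
]
for url in county_zip_urls(sample):
    print(url)
```

In Python 3, `read()` returns bytes, so the listing would need decoding before it is split, for example `.read().decode().splitlines()`.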



Answer 2:


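import os, urllib2

# Local path where the downloaded archive is saved.
out = os.path.join("/tmp", "test.zip")
url = "ftp://ftp2.census.gov/geo/tiger/TIGER2008/01_ALABAMA/01001_Autauga_County/tl_2008_01001_edges.zip"
page = urllib2.urlopen(url)
# Write in binary mode; the with block closes the file when done.
with open(out, "wb") as f:
    f.write(page.read())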



Answer 3:


Per the docs, urlretrieve saves the file to disk and returns a tuple (filename, headers). So the file is already saved when urlretrieve returns.

You can open and read the ZIP file you've retrieved with the zipfile module of the standard library. glob works only on normal filesystem directories, so it cannot look inside ZIP files or list an FTP site.
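Since glob cannot look inside an archive, one possible substitute is to match the names returned by `ZipFile.namelist()` with `fnmatch`, which applies the same shell-style patterns to plain strings. A sketch for Python 3, where `urlretrieve` lives in `urllib.request`; the helper name `fetch_and_match` is an illustrative assumption, not part of any library.

```python
import fnmatch
import zipfile
from urllib.request import urlretrieve


def fetch_and_match(url, dest_path, pattern):
    """Download url to dest_path and return archive members matching pattern."""
    # urlretrieve saves the file and returns (filename, headers).
    filename, _headers = urlretrieve(url, dest_path)
    with zipfile.ZipFile(filename) as zf:
        # fnmatch.filter applies a glob-style pattern to a list of names.
        return fnmatch.filter(zf.namelist(), pattern)
```

For example, `fetch_and_match(url, "edges.zip", "*.shp")` would return only the members ending in `.shp`, if the archive contains any; the `*.shp` pattern is just a guess at what such an archive might hold.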



Source: https://stackoverflow.com/questions/2289768/python-and-urllib
