urllib

Python and urllib

Submitted by 谁都会走 on 2019-12-29 07:39:11

Question: I'm trying to download a zip file ("tl_2008_01001_edges.zip") from an FTP census site using urllib. What form is the zip file in when I get it, and how do I save it? I'm fairly new to Python and don't understand how urllib works. This is my attempt:

    import urllib, sys
    zip_file = urllib.urlretrieve("ftp://ftp2.census.gov/geo/tiger/TIGER2008/01_ALABAMA/Autauga_County/", "tl_2008_01001_edges.zip")

If I know the list of ftp folders (or counties in this case), can I run through the ftp site list
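A minimal sketch of one way to do this in Python 3, where `urlretrieve` lives in `urllib.request` (the Python-2 snippet above also passes the directory URL rather than the file URL). The path pattern is copied from the question's own URL; treating it as a general template for every county is an assumption:

```python
import urllib.request

BASE = "ftp://ftp2.census.gov/geo/tiger/TIGER2008"

def edges_zip_url(state_dir, county_dir, county_fips):
    # Path layout copied from the question's URL; generalizing it to
    # every county/state directory is an assumption.
    return "%s/%s/%s/tl_2008_%s_edges.zip" % (BASE, state_dir, county_dir, county_fips)

def download_county(state_dir, county_dir, county_fips):
    # urlretrieve streams the remote bytes straight into a local file,
    # so the zip arrives on disk exactly as stored on the server.
    url = edges_zip_url(state_dir, county_dir, county_fips)
    filename = url.rsplit("/", 1)[1]
    urllib.request.urlretrieve(url, filename)  # network transfer happens here
    return filename

print(edges_zip_url("01_ALABAMA", "Autauga_County", "01001"))
# calling download_county("01_ALABAMA", "Autauga_County", "01001") would
# save tl_2008_01001_edges.zip into the current directory
```

Given a list of county directories, looping over them and calling `download_county` for each would walk the whole site.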

How to use urllib in python 3?

Submitted by 霸气de小男生 on 2019-12-28 04:07:09

Question: Here is my problem with urllib in Python 3. I wrote a piece of code that works well in Python 2.7 using urllib2. It goes to a page on the Internet (which requires authorization) and grabs the info from that page. The real problem is that I can't make my code work in Python 3.4, because there is no urllib2 and urllib works differently; even after a few hours of googling and reading I got nothing. So if somebody can help me solve this, I'd really appreciate the help. Here
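A sketch of the Python 3 equivalent, assuming the page uses HTTP basic auth (the question does not say which scheme or site is involved, so the URL and credentials here are placeholders):

```python
import urllib.request

# Hypothetical URL and credentials -- the question doesn't name the site.
URL = "http://example.com/protected/page"

def make_opener(url, user, password):
    # Python 2's urllib2.HTTPBasicAuthHandler and friends moved to
    # urllib.request in Python 3; the pattern is otherwise unchanged.
    mgr = urllib.request.HTTPPasswordMgrWithDefaultRealm()
    mgr.add_password(None, url, user, password)
    handler = urllib.request.HTTPBasicAuthHandler(mgr)
    return urllib.request.build_opener(handler)

opener = make_opener(URL, "user", "secret")
# opener.open(URL).read() would fetch the page (needs network access)
print(type(opener).__name__)  # OpenerDirector
```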

Python入妖3: Basic Usage of the Urllib Library

Submitted by 风流意气都作罢 on 2019-12-28 02:39:22

What is Urllib? Urllib is Python's built-in HTTP request library. It includes the following modules:

    urllib.request      the request module
    urllib.error        the exception-handling module
    urllib.parse        the URL-parsing module
    urllib.robotparser  the robots.txt-parsing module

urlopen

The signature of urllib.request.urlopen:

    urllib.request.urlopen(url, data=None, [timeout, ]*, cafile=None, capath=None, cadefault=False, context=None)

The url parameter. Start with a simple example:

    import urllib.request
    # The urllib module provides an interface for reading web-page data;
    # we can read data over www and ftp much as we read a local file.
    # urlopen opens a URL; read() reads the data behind that URL.
    response = urllib.request.urlopen('http://www.baidu.com')
    print(response.read().decode('utf-8'))

urlopen is usually called with three parameters:

    urllib.request.urlopen(url, data, timeout)
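To illustrate urllib.parse, the URL-parsing module listed above, a small self-contained sketch (the example URL is made up and nothing here touches the network):

```python
from urllib.parse import urlparse, urlencode

# urlparse splits a URL into its named components.
parts = urlparse("http://www.baidu.com/s?wd=python#top")
print(parts.scheme)    # http
print(parts.netloc)    # www.baidu.com
print(parts.path)      # /s
print(parts.query)     # wd=python
print(parts.fragment)  # top

# urlencode builds a query string from a dict, e.g. for urlopen's data argument.
print(urlencode({"wd": "python", "page": 1}))  # wd=python&page=1
```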

How to know if urllib.urlretrieve succeeds?

Submitted by 倾然丶 夕夏残阳落幕 on 2019-12-28 01:51:16

Question: urllib.urlretrieve returns silently even if the file doesn't exist on the remote HTTP server; it just saves an HTML page to the named file. For example:

    urllib.urlretrieve('http://google.com/abc.jpg', 'abc.jpg')

just returns silently, even if abc.jpg doesn't exist on the google.com server; the generated abc.jpg is not a valid JPEG file, it's actually an HTML page. I guess the returned headers (an httplib.HTTPMessage instance) can be used to tell whether the retrieval succeeds or not, but I
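One way to act on that guess: urlretrieve returns a (filename, headers) pair, and inspecting the Content-Type header distinguishes a real image from an HTML error page. Since the headers object derives from email.message.Message, the check can be demonstrated offline with plain Message instances (the helper name is ours):

```python
from email.message import Message

def looks_like_jpeg(headers):
    # Heuristic: many servers answer 200 with an HTML error page for a
    # missing file, so the Content-Type header is the tell-tale.
    return headers.get("Content-Type", "").startswith("image/jpeg")

# Simulated header objects -- no network needed for the demonstration.
ok = Message()
ok["Content-Type"] = "image/jpeg"
bad = Message()
bad["Content-Type"] = "text/html; charset=utf-8"
print(looks_like_jpeg(ok), looks_like_jpeg(bad))  # True False
```

In Python 3, an alternative is urllib.request.urlopen, which raises urllib.error.HTTPError for genuine 404 responses instead of returning silently.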

Which character produces “%3Q” from urllib.quote?

Submitted by 浪尽此生 on 2019-12-25 13:09:19

Question: We're encoding URLs containing user-supplied data using Python's urllib.quote(); one of the URLs is producing a string with the following substring: '%3Q'. I'm trying to figure out what the original user-supplied character is, but am getting stumped: I can't seem to find any resource that mentions a character producing this output from urllib.quote(). Strangely, there are no results on Stack Overflow for this string of characters. It's also important to note that the user-supplied data is
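A short offline check of what quote can and cannot emit (using the Python 3 location urllib.parse.quote; Python 2's urllib.quote behaves the same way here):

```python
import re
from urllib.parse import quote

# quote encodes a byte as '%' plus exactly two *hex* digits, so '%3Q'
# ('Q' is not a hex digit) cannot come out of quote itself -- it usually
# means the input already contained a literal '%' followed by '3Q'.
print(quote("?"))    # %3F
print(quote("%3Q"))  # %253Q  -- a literal '%' is itself encoded as %25

encoded = quote("".join(chr(c) for c in range(128)))
# Every escape in the output is '%' plus two uppercase hex digits.
assert re.fullmatch(r"(?:[A-Za-z0-9_.~/-]|%[0-9A-F]{2})*", encoded)
```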

POST method in Python: errno 104

Submitted by 人走茶凉 on 2019-12-25 11:58:22

Question: I am trying to query a website in Python. I need to use a POST method (according to what is happening in my browser when I monitor it with the developer tools). If I query the website with cURL, it works well:

    curl -i --data "param1=var1&param2=var2" http://www.test.com

I get this header:

    HTTP/1.1 200 OK
    Date: Tue, 26 Sep 2017 08:46:18 GMT
    Server: Apache/1.3.33 (Unix) mod_gzip/1.3.26.1a mod_fastcgi/2.4.2 PHP/4.3.11
    Transfer-Encoding: chunked
    Content-Type: text/html

But when I do it in Python
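A sketch of the urllib equivalent of that cURL call (www.test.com is the placeholder host from the question; the request itself is constructed but not sent here):

```python
import urllib.request

# Equivalent of: curl --data "param1=var1&param2=var2" http://www.test.com
data = b"param1=var1&param2=var2"  # already URL-encoded, like the cURL --data value
req = urllib.request.Request("http://www.test.com", data=data)
req.add_header("Content-Type", "application/x-www-form-urlencoded")

print(req.get_method())  # POST -- a Request carrying a body defaults to POST
# errno 104 ("connection reset by peer") means the server closed the socket;
# mirroring the headers cURL sends (Content-Type, User-Agent) often helps.
# urllib.request.urlopen(req) would perform the request (network required)
```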

How to get round the HTTP Error 403: Forbidden with urllib.request using Python 3

Submitted by 前提是你 on 2019-12-25 09:08:42

Question: Hi, not every time but sometimes when trying to gain access to the LSE code, I am thrown the ever-annoying HTTP Error 403: Forbidden message. Does anyone know how I can overcome this issue using only standard Python modules (so, sadly, no Beautiful Soup)?

    import urllib.request
    url = "http://www.londonstockexchange.com/exchange/prices-and-markets/stocks/indices/ftse-indices.html"
    infile = urllib.request.urlopen(url)  # Open the URL
    data = infile.read().decode('ISO-8859-1')  # Read the content as string
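A stdlib-only sketch of the usual workaround: many sites refuse urllib's default "Python-urllib/3.x" User-Agent with a 403, so supplying a different one often helps (the User-Agent string below is an arbitrary example, and the request is built but not sent):

```python
import urllib.request

URL = ("http://www.londonstockexchange.com/exchange/prices-and-markets/"
       "stocks/indices/ftse-indices.html")

# Attach a browser-like User-Agent instead of urllib's default.
req = urllib.request.Request(
    URL,
    headers={"User-Agent": "Mozilla/5.0 (compatible; example-fetcher)"},
)
print(req.get_header("User-agent"))  # urllib stores header names capitalized
# infile = urllib.request.urlopen(req)           # network call, not run here
# data = infile.read().decode("ISO-8859-1")
```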

Can't “import urllib.request, urllib.parse, urllib.error”

Submitted by 百般思念 on 2019-12-25 08:29:49

Question: I am trying to convert my project to Python 3. My server script is server.py:

    #!/usr/bin/env python
    #-*-coding:utf8-*-
    import http.server
    import os, sys

    server = http.server.HTTPServer
    handler = http.server.CGIHTTPRequestHandler
    server_address = ("", 8080)
    #handler.cgi_directories = [""]

    httpd = server(server_address, handler)
    httpd.serve_forever()

But when I try:

    import urllib.request, urllib.parse, urllib.error

I get this in the terminal of python3 ./server.py: import urllib.request, urllib.parse,
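A likely cause, offered as an assumption: the `#!/usr/bin/env python` shebang resolves to Python 2 on many systems, and urllib was split into request/parse/error submodules only in Python 3, so the import fails there. A minimal check under a python3 shebang:

```python
#!/usr/bin/env python3
# Under Python 2, "import urllib.request" raises ImportError because
# the urllib submodules exist only in Python 3.
import sys

print(sys.version_info.major)  # the interpreter the shebang actually picked

import urllib.request, urllib.parse, urllib.error
print(urllib.request.__name__)  # urllib.request
```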

Python Crawler, Part 2: Basic Usage of the Urllib Library

Submitted by 我是研究僧i on 2019-12-25 07:31:43

What is Urllib? Urllib is Python's built-in HTTP request library. It includes the following modules:

    urllib.request      the request module
    urllib.error        the exception-handling module
    urllib.parse        the URL-parsing module
    urllib.robotparser  the robots.txt-parsing module

urlopen

The signature of urllib.request.urlopen:

    urllib.request.urlopen(url, data=None, [timeout, ]*, cafile=None, capath=None, cadefault=False, context=None)

The url parameter. Start with a simple example:

    import urllib.request
    response = urllib.request.urlopen('http://www.baidu.com')
    print(response.read().decode('utf-8'))

urlopen is usually called with three parameters:

    urllib.request.urlopen(url, data, timeout)

response.read() retrieves the page content; without calling read(), something different is returned.

The data parameter. The example above sends a GET request to Baidu and obtains Baidu
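The module list above also names urllib.robotparser. A small sketch of it (the robots.txt lines are fed in directly, so this runs without any network access; the rules shown are made up):

```python
from urllib.robotparser import RobotFileParser

# parse() accepts robots.txt content as a list of lines, which lets a
# crawler test its rules without fetching anything.
rp = RobotFileParser()
rp.parse([
    "User-agent: *",
    "Disallow: /private/",
])
print(rp.can_fetch("*", "http://example.com/index.html"))  # True
print(rp.can_fetch("*", "http://example.com/private/x"))   # False
```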