urllib

Pause before retry connection in Python

别来无恙 submitted on 2019-12-22 18:06:59
Question: I am trying to connect to a server. Sometimes I cannot reach the server and would like to pause for a few seconds before trying again. How would I implement the pause feature in Python? Here is what I have so far. Thank you.

    while True:
        try:
            response = urllib.request.urlopen(http)
        except URLError as e:
            continue
        break

I am using Python 3.2.

Answer 1: This will block the thread for 2 seconds before continuing:

    import time
    time.sleep(2)

Answer 2: In case you want to run lots of these in parallel, it would
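Combining the asker's loop with the sleep from the answer, a minimal sketch of the retry-with-pause pattern (the `opener` parameter and retry limit are illustrative assumptions, not part of the original question):

```python
import time
import urllib.request
from urllib.error import URLError

def fetch_with_retry(url, retries=3, pause=2.0, opener=urllib.request.urlopen):
    """Try to open url; on URLError, sleep `pause` seconds and retry."""
    for attempt in range(retries):
        try:
            return opener(url)
        except URLError:
            if attempt == retries - 1:
                raise  # out of attempts, re-raise the last error
            time.sleep(pause)  # block before the next attempt
```

Passing `opener` explicitly also makes the loop easy to test without touching the network.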

python3中urllib的基本使用

允我心安 submitted on 2019-12-22 16:58:58
urllib

In Python 3, urllib and urllib2 were merged; there is now a single urllib package. The contents of urllib and urllib2 moved into urllib.request, and urlparse moved into urllib.parse.

urlparse — parses a URL string into its components:

    # -*- coding:utf-8 -*-
    import urllib.request
    import urllib.parse

    url = "http://www.baidu.com"
    parsed = urllib.parse.urlparse(url)
    print(parsed)
    # Output: ParseResult(scheme='http', netloc='www.baidu.com', path='', params='', query='', fragment='')

urljoin(baseurl, newurl, allowFrag=None) — joins a base URL and a new URL into one complete URL:

    import urllib.parse

    url = "http://www.baidu.com"
    new_path = urllib.parse.urljoin(url, "index.html")
    print(new_path)
    # Output: http://www.baidu.com/index
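A small sketch of the renamed APIs side by side, showing where the old urllib/urllib2/urlparse names now live (the example URLs are illustrative):

```python
from urllib.parse import urlparse, urljoin, urlencode

# urlparse (formerly the urlparse module) splits a URL into components
parts = urlparse("http://example.com/path?q=1")

# urljoin resolves a relative reference against a base URL
full = urljoin("http://example.com/docs/", "index.html")

# urlencode (formerly urllib.urlencode) builds a query string
query = urlencode({"q": "python urllib"})
```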

Python 3 urllib with self-signed certificates

牧云@^-^@ submitted on 2019-12-22 11:15:38
Question: I'm attempting to download some data from an internal server using Python. Since it's internal, it uses a self-signed certificate. (We don't want to pay Verisign for servers that will never appear "in the wild.") The Python 2.6 version of the code worked fine:

    response = urllib2.urlopen(URL)
    data = csv.reader(response)

I'm now trying to update to Python 3.4 (long story, don't ask.) However, using Python 3's urllib fails:

    response = urllib.request.urlopen(URL)

It throws a CERTIFICATE_VERIFY
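Python 3.4+ verifies certificates by default, which is what 2.6 silently skipped. One common workaround is to pass an SSL context that disables verification; a sketch (only appropriate for trusted internal hosts):

```python
import ssl
import urllib.request

# Build a context that skips certificate verification. This is only
# reasonable for trusted internal servers with self-signed certificates.
ctx = ssl.create_default_context()
ctx.check_hostname = False
ctx.verify_mode = ssl.CERT_NONE

# response = urllib.request.urlopen(URL, context=ctx)  # URL is the internal server
```

A cleaner long-term fix is to add the self-signed certificate to the context with `ctx.load_verify_locations()` instead of disabling verification.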

Python3: urllib.error.HTTPError: HTTP Error 403: Forbidden

霸气de小男生 submitted on 2019-12-22 10:39:25
Question: Please, help me! I am using Python 3.3 and this code:

    import urllib.request
    import sys

    Open_Page = urllib.request.urlopen(
        "http://wowcircle.com"
    ).read().decode().encode('utf-8')

And I get this:

    Traceback (most recent call last):
      File "C:\Users\1\Desktop\WCLauncer\reg.py", line 5, in <module>
        "http://forum.wowcircle.com"
      File "C:\Python33\lib\urllib\request.py", line 156, in urlopen
        return opener.open(url, data, timeout)
      File "C:\Python33\lib\urllib\request.py", line 475, in open
        response =
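A 403 from urlopen often means the server rejects urllib's default "Python-urllib" User-Agent. A common sketch is to send a browser-like header via a Request object (whether this particular site accepts it is an assumption):

```python
import urllib.request

# Build the request with an explicit User-Agent header; many servers
# return 403 for the default "Python-urllib/x.y" agent string.
req = urllib.request.Request(
    "http://forum.wowcircle.com",
    headers={"User-Agent": "Mozilla/5.0"},
)
# page = urllib.request.urlopen(req).read().decode("utf-8")
```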

Python urllib,urllib2 fill form

最后都变了- submitted on 2019-12-22 10:05:00
Question: I want to fill an HTML form with urllib2 and urllib:

    import urllib
    import urllib2

    url = 'site.com/registration.php'
    values = {'password': 'password',
              'username': 'username'}
    data = urllib.urlencode(values)
    req = urllib2.Request(url, data)
    response = urllib2.urlopen(req)
    the_page = response.read()

But at the end of the form there is a button (input type='submit'). If you don't click the button, you can't send the data you wrote in the text inputs. How can I click the button with urllib and
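"Clicking" a submit button is just including its name/value pair in the POST body; the browser does nothing more than that. A Python 3 sketch (the URL and the submit field's name/value, `"submit": "Register"`, are illustrative assumptions — check the form's actual HTML):

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Include the submit button's own name/value pair alongside the fields.
values = {"username": "username", "password": "password", "submit": "Register"}
data = urlencode(values).encode("ascii")  # POST bodies must be bytes in Python 3
req = Request("http://site.com/registration.php", data)
# the_page = urlopen(req).read()
```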

How to convert characters like these, "a³ a¡ a´a§", in unicode, using Python?

荒凉一梦 submitted on 2019-12-22 06:31:22
Question: I'm making a crawler to get the text inside HTML; I'm using BeautifulSoup. When I open the URL using urllib2, the library automatically converts HTML that used Portuguese accents like "ã ó é õ" into other characters like "a³ a¡ a´a§". What I want is just to get the words without accents: contrã¡rio -> contrario. I tried to use this algorithm, but it only works when the text uses words like "olá coração contrário":

    def strip_accents(s):
        return ''.join((c for c in unicodedata
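The "a³ a¡" garbage is usually mojibake: the response bytes were decoded with the wrong codec, not a urllib2 conversion. Decoding with the page's real charset first, then stripping accents, is the usual fix; a sketch of a complete strip_accents in the spirit of the truncated one:

```python
import unicodedata

def strip_accents(s: str) -> str:
    # NFD splits each accented letter into base letter + combining mark;
    # dropping the marks (category "Mn") leaves the plain letters.
    return "".join(c for c in unicodedata.normalize("NFD", s)
                   if unicodedata.category(c) != "Mn")

# Decode the bytes with the page's declared charset before stripping,
# e.g.:  text = response.read().decode("utf-8")  # or "latin-1"
```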

pip, proxy authentication and “Not supported proxy scheme”

廉价感情. submitted on 2019-12-22 04:05:27
Question: Trying to install pip on a new Python installation. I am stuck with proxy errors. Looks like a bug in get-pip or urllib3?? The question is: do I have to go through the pain of setting up CNTLM as described here, or is there a shortcut? The get-pip.py documentation says to use the --proxy="[user:passwd@]proxy.server:port" option to specify the proxy and the relevant authentication. But it seems like pip passes the whole thing as-is to urllib3, which interprets "myusr" as the URL scheme, because of the ':' I guess (
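A proxy URL with an explicit scheme and percent-encoded credentials often avoids the "Not supported proxy scheme" parse error. A sketch for building one (the username, password, host, and port are placeholders):

```python
from urllib.parse import quote

user, password = "myusr", "p@ss:word"  # placeholder credentials
# Percent-encode so ':' and '@' inside the credentials can't be
# mistaken for URL delimiters, and prefix the scheme explicitly.
proxy = "http://{}:{}@proxy.server:3128".format(
    quote(user, safe=""), quote(password, safe=""))
# Then pass it through, e.g.:  python get-pip.py --proxy="http://...:...@proxy.server:3128"
```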

Build query string using urlencode python

寵の児 submitted on 2019-12-22 04:02:19
Question: I am trying to build a URL so that I can send a GET request to it using the urllib module. Let's suppose my final_url should be:

    url = "www.example.com/find.php?data=http%3A%2F%2Fwww.stackoverflow.com&search=Generate+value"

Now to achieve this I tried the following way:

    >>> initial_url = "http://www.stackoverflow.com"
    >>> search = "Generate+value"
    >>> params = {"data": initial_url, "search": search}
    >>> query_string = urllib.urlencode(params)
    >>> query_string
    'search=Generate%2Bvalue&data=http%3A%2F
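The %2B in that output is urlencode correctly escaping a literal '+'; passing the raw, space-separated value lets urlencode produce the '+' itself. A Python 3 sketch (urllib.urlencode moved to urllib.parse.urlencode):

```python
from urllib.parse import urlencode

# Pass the unencoded values; urlencode turns spaces into '+' and
# percent-escapes the reserved characters in the URL value.
params = {"data": "http://www.stackoverflow.com", "search": "Generate value"}
query_string = urlencode(params)
url = "www.example.com/find.php?" + query_string
```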

get file size before downloading using HTTP header not matching with one retrieved from urlopen

不羁岁月 submitted on 2019-12-21 22:29:34
Question: Why is the content-length different when using requests and urlopen(url).info()?

    >>> url = 'http://pymotw.com/2/urllib/index.html'
    >>> requests.head(url).headers.get('content-length', None)
    '8176'
    >>> urllib.urlopen(url).info()['content-length']
    '38227'
    >>> len(requests.get(url).content)
    38274

I was going to check the size of the file in bytes to split the buffer across multiple threads based on Range in urllib2, but if I do not have the actual size of the file in bytes it won't work.. only
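The smaller HEAD value is most likely the compressed length: requests advertises gzip via Accept-Encoding, so the server reports the compressed size, while urllib gets the raw one. Once the true byte size is known, turning it into Range headers is simple arithmetic; a sketch of a helper (the function name and split strategy are illustrative):

```python
def byte_ranges(total_size, parts):
    """Split total_size bytes into `parts` inclusive (start, end) pairs
    suitable for 'Range: bytes=start-end' request headers."""
    chunk = total_size // parts
    ranges = []
    for i in range(parts):
        start = i * chunk
        # The last chunk absorbs any remainder from integer division.
        end = total_size - 1 if i == parts - 1 else start + chunk - 1
        ranges.append((start, end))
    return ranges
```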

Connect to FTP server through http proxy

故事扮演 submitted on 2019-12-21 22:09:12
Question: My code below gives me the error:

    socket.gaierror: [Errno 11001] getaddrinfo failed

when calling the method ftp.connect(). My question is: why can I connect to google.com, but connecting to an FTP server gives me an error? And how can I connect to the FTP server from behind an HTTP proxy?

    import ftplib
    import urllib.request

    # ftp settings
    ftpusername = 'abc'
    ftppassword = 'xyz'
    ftp_host = 'host'
    ftp_port = 1234
    proxy_url = 'http://username:password@host:port'
    proxy_support = urllib.request
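One likely reason the two behave differently: ftplib opens raw TCP sockets and ignores urllib's opener chain, so an opener installed with install_opener() never applies to ftp.connect(). An HTTP proxy can still fetch ftp:// URLs on your behalf through urllib itself; a sketch (the proxy address and credentials are placeholders):

```python
import urllib.request

# Route ftp:// (and http://) URLs through the HTTP proxy. Note this
# only affects urllib requests, never direct ftplib connections.
proxy_url = "http://username:password@proxyhost:8080"  # placeholder proxy
opener = urllib.request.build_opener(
    urllib.request.ProxyHandler({"ftp": proxy_url, "http": proxy_url}))
# data = opener.open("ftp://abc:xyz@host:1234/some/file").read()
```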