urllib

Pause before retry connection in Python

别来无恙 submitted on 2019-12-22 18:06:59
Question: I am trying to connect to a server. Sometimes I cannot reach the server and would like to pause for a few seconds before trying again. How would I implement the pause feature in Python? Here is what I have so far. Thank you.

    while True:
        try:
            response = urllib.request.urlopen(http)
        except URLError as e:
            continue
        break

I am using Python 3.2.

Answer 1: This will block the thread for 2 seconds before continuing:

    import time
    time.sleep(2)

Answer 2: In case you want to run lots of these in parallel, it would
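Combining the asker's loop with the sleep from the answer, a minimal sketch of the retry-with-pause pattern (the `opener` parameter and retry limit are illustrative assumptions, not part of the original question):

```python
import time
import urllib.request
from urllib.error import URLError

def fetch_with_retry(url, retries=3, pause=2.0, opener=urllib.request.urlopen):
    """Try to open url; on URLError, sleep `pause` seconds and retry."""
    for attempt in range(retries):
        try:
            return opener(url)
        except URLError:
            if attempt == retries - 1:
                raise  # out of attempts, re-raise the last error
            time.sleep(pause)  # block before the next attempt
```

Passing `opener` explicitly also makes the loop easy to test without touching the network.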

python3中urllib的基本使用

允我心安 submitted on 2019-12-22 16:58:58
urllib

In Python 3, urllib and urllib2 were merged; there is now a single urllib package. The contents of urllib and urllib2 moved into urllib.request, and urlparse moved into urllib.parse.

urlparse — parses a URL string into its components:

    # -*- coding:utf-8 -*-
    import urllib.request
    import urllib.parse

    url = "http://www.baidu.com"
    parsed = urllib.parse.urlparse(url)
    print(parsed)
    # Output: ParseResult(scheme='http', netloc='www.baidu.com', path='', params='', query='', fragment='')

urljoin(baseurl, newurl, allowFrag=None) — joins a base URL and a new URL into one complete URL:

    import urllib.parse

    url = "http://www.baidu.com"
    new_path = urllib.parse.urljoin(url, "index.html")
    print(new_path)
    # Output: http://www.baidu.com/index
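A small sketch of the renamed APIs side by side, showing where the old urllib/urllib2/urlparse names now live (the example URLs are illustrative):

```python
from urllib.parse import urlparse, urljoin, urlencode

# urlparse (formerly the urlparse module) splits a URL into components
parts = urlparse("http://example.com/path?q=1")

# urljoin resolves a relative reference against a base URL
full = urljoin("http://example.com/docs/", "index.html")

# urlencode (formerly urllib.urlencode) builds a query string
query = urlencode({"q": "python urllib"})
```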

Python 3 urllib with self-signed certificates

牧云@^-^@ submitted on 2019-12-22 11:15:38
Question: I'm attempting to download some data from an internal server using Python. Since it's internal, it uses a self-signed certificate. (We don't want to pay Verisign for servers that will never appear "in the wild.") The Python 2.6 version of the code worked fine:

    response = urllib2.urlopen(URL)
    data = csv.reader(response)

I'm now trying to update to Python 3.4 (long story, don't ask.) However, using Python 3's urllib fails:

    response = urllib.request.urlopen(URL)

It throws a CERTIFICATE_VERIFY
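Python 3.4+ verifies certificates by default, which is what 2.6 silently skipped. One common workaround is to pass an SSL context that disables verification; a sketch (only appropriate for trusted internal hosts):

```python
import ssl
import urllib.request

# Build a context that skips certificate verification. This is only
# reasonable for trusted internal servers with self-signed certificates.
ctx = ssl.create_default_context()
ctx.check_hostname = False
ctx.verify_mode = ssl.CERT_NONE

# response = urllib.request.urlopen(URL, context=ctx)  # URL is the internal server
```

A cleaner long-term fix is to add the self-signed certificate to the context with `ctx.load_verify_locations()` instead of disabling verification.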

Python3: urllib.error.HTTPError: HTTP Error 403: Forbidden

霸气de小男生 submitted on 2019-12-22 10:39:25
Question: Please, help me! I am using Python 3.3 and this code:

    import urllib.request
    import sys

    Open_Page = urllib.request.urlopen(
        "http://wowcircle.com"
    ).read().decode().encode('utf-8')

And I get this:

    Traceback (most recent call last):
      File "C:\Users\1\Desktop\WCLauncer\reg.py", line 5, in <module>
        "http://forum.wowcircle.com"
      File "C:\Python33\lib\urllib\request.py", line 156, in urlopen
        return opener.open(url, data, timeout)
      File "C:\Python33\lib\urllib\request.py", line 475, in open
        response =
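A 403 from urlopen often means the server rejects urllib's default "Python-urllib" User-Agent. A common sketch is to send a browser-like header via a Request object (whether this particular site accepts it is an assumption):

```python
import urllib.request

# Build the request with an explicit User-Agent header; many servers
# return 403 for the default "Python-urllib/x.y" agent string.
req = urllib.request.Request(
    "http://forum.wowcircle.com",
    headers={"User-Agent": "Mozilla/5.0"},
)
# page = urllib.request.urlopen(req).read().decode("utf-8")
```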

Python urllib,urllib2 fill form

最后都变了- submitted on 2019-12-22 10:05:00
Question: I want to fill an HTML form with urllib2 and urllib:

    import urllib
    import urllib2

    url = 'site.com/registration.php'
    values = {'password': 'password',
              'username': 'username'}
    data = urllib.urlencode(values)
    req = urllib2.Request(url, data)
    response = urllib2.urlopen(req)
    the_page = response.read()

But at the end of the form there is a button (input type='submit'). If you don't click the button, you can't send the data you wrote in the text inputs. How can I click the button with urllib and
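"Clicking" a submit button is just including its name/value pair in the POST body; the browser does nothing more than that. A Python 3 sketch (the URL and the submit field's name/value, `"submit": "Register"`, are illustrative assumptions — check the form's actual HTML):

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Include the submit button's own name/value pair alongside the fields.
values = {"username": "username", "password": "password", "submit": "Register"}
data = urlencode(values).encode("ascii")  # POST bodies must be bytes in Python 3
req = Request("http://site.com/registration.php", data)
# the_page = urlopen(req).read()
```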

How to convert characters like these, "a³ a¡ a´a§", in unicode, using Python?

荒凉一梦 submitted on 2019-12-22 06:31:22
Question: I'm making a crawler to get the text inside HTML; I'm using BeautifulSoup. When I open the URL using urllib2, the library automatically converts HTML that used Portuguese accents like "ã ó é õ" into other characters like "a³ a¡ a´a§". What I want is just to get the words without accents: contrã¡rio -> contrario. I tried to use this algorithm, but it only works when the text uses words like "olá coração contrário":

    def strip_accents(s):
        return ''.join((c for c in unicodedata
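The "a³ a¡" garbage is usually mojibake: the response bytes were decoded with the wrong codec, not a urllib2 conversion. Decoding with the page's real charset first, then stripping accents, is the usual fix; a sketch of a complete strip_accents in the spirit of the truncated one:

```python
import unicodedata

def strip_accents(s: str) -> str:
    # NFD splits each accented letter into base letter + combining mark;
    # dropping the marks (category "Mn") leaves the plain letters.
    return "".join(c for c in unicodedata.normalize("NFD", s)
                   if unicodedata.category(c) != "Mn")

# Decode the bytes with the page's declared charset before stripping,
# e.g.:  text = response.read().decode("utf-8")  # or "latin-1"
```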

pip, proxy authentication and “Not supported proxy scheme”

廉价感情. submitted on 2019-12-22 04:05:27
Question: Trying to install pip on a new Python installation. I am stuck with proxy errors. Looks like a bug in get-pip or urllib3?? The question is: do I have to go through the pain of setting up CNTLM as described here, or is there a shortcut? The get-pip.py documentation says to use the --proxy="[user:passwd@]proxy.server:port" option to specify the proxy and the relevant authentication. But it seems like pip passes the whole thing as-is to urllib3, which interprets "myusr" as the URL scheme, because of the ':' I guess (
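A proxy URL with an explicit scheme and percent-encoded credentials often avoids the "Not supported proxy scheme" parse error. A sketch for building one (the username, password, host, and port are placeholders):

```python
from urllib.parse import quote

user, password = "myusr", "p@ss:word"  # placeholder credentials
# Percent-encode so ':' and '@' inside the credentials can't be
# mistaken for URL delimiters, and prefix the scheme explicitly.
proxy = "http://{}:{}@proxy.server:3128".format(
    quote(user, safe=""), quote(password, safe=""))
# Then pass it through, e.g.:  python get-pip.py --proxy="http://...:...@proxy.server:3128"
```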

Build query string using urlencode python

寵の児 submitted on 2019-12-22 04:02:19
Question: I am trying to build a URL so that I can send a GET request to it using the urllib module. Let's suppose my final_url should be:

    url = "www.example.com/find.php?data=http%3A%2F%2Fwww.stackoverflow.com&search=Generate+value"

Now to achieve this I tried the following way:

    >>> initial_url = "http://www.stackoverflow.com"
    >>> search = "Generate+value"
    >>> params = {"data": initial_url, "search": search}
    >>> query_string = urllib.urlencode(params)
    >>> query_string
    'search=Generate%2Bvalue&data=http%3A%2F
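The %2B in that output is urlencode correctly escaping a literal '+'; passing the raw, space-separated value lets urlencode produce the '+' itself. A Python 3 sketch (urllib.urlencode moved to urllib.parse.urlencode):

```python
from urllib.parse import urlencode

# Pass the unencoded values; urlencode turns spaces into '+' and
# percent-escapes the reserved characters in the URL value.
params = {"data": "http://www.stackoverflow.com", "search": "Generate value"}
query_string = urlencode(params)
url = "www.example.com/find.php?" + query_string
```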

get file size before downloading using HTTP header not matching with one retrieved from urlopen

不羁岁月 submitted on 2019-12-21 22:29:34
Question: Why is the content-length different when using requests and urlopen(url).info()?

    >>> url = 'http://pymotw.com/2/urllib/index.html'
    >>> requests.head(url).headers.get('content-length', None)
    '8176'
    >>> urllib.urlopen(url).info()['content-length']
    '38227'
    >>> len(requests.get(url).content)
    38274

I was going to check the size of the file in bytes to split the buffer across multiple threads based on Range in urllib2, but if I do not have the actual size of the file in bytes it won't work.. only
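The smaller HEAD value is most likely the compressed length: requests advertises gzip via Accept-Encoding, so the server reports the compressed size, while urllib gets the raw one. Once the true byte size is known, turning it into Range headers is simple arithmetic; a sketch of a helper (the function name and split strategy are illustrative):

```python
def byte_ranges(total_size, parts):
    """Split total_size bytes into `parts` inclusive (start, end) pairs
    suitable for 'Range: bytes=start-end' request headers."""
    chunk = total_size // parts
    ranges = []
    for i in range(parts):
        start = i * chunk
        # The last chunk absorbs any remainder from integer division.
        end = total_size - 1 if i == parts - 1 else start + chunk - 1
        ranges.append((start, end))
    return ranges
```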

Connect to FTP server through http proxy

故事扮演 submitted on 2019-12-21 22:09:12
Question: My code below gives me the error:

    socket.gaierror: [Errno 11001] getaddrinfo failed

when calling the method ftp.connect(). My question is: why can I connect to google.com, but connecting to an FTP server gives me an error? And how can I connect to the FTP server from behind an HTTP proxy?

    import ftplib
    import urllib.request

    # ftp settings
    ftpusername = 'abc'
    ftppassword = 'xyz'
    ftp_host = 'host'
    ftp_port = 1234
    proxy_url = 'http://username:password@host:port'
    proxy_support = urllib.request
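One likely reason the two behave differently: ftplib opens raw TCP sockets and ignores urllib's opener chain, so an opener installed with install_opener() never applies to ftp.connect(). An HTTP proxy can still fetch ftp:// URLs on your behalf through urllib itself; a sketch (the proxy address and credentials are placeholders):

```python
import urllib.request

# Route ftp:// (and http://) URLs through the HTTP proxy. Note this
# only affects urllib requests, never direct ftplib connections.
proxy_url = "http://username:password@proxyhost:8080"  # placeholder proxy
opener = urllib.request.build_opener(
    urllib.request.ProxyHandler({"ftp": proxy_url, "http": proxy_url}))
# data = opener.open("ftp://abc:xyz@host:1234/some/file").read()
```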