urlopen

How do I set cookies using Python urlopen?

好久不见 · Posted on 2019-12-20 17:28:07

Question: I am trying to fetch an HTML site using Python urlopen. I am getting this error:

HTTPError: HTTP Error 302: The HTTP server returned a redirect error that would lead to an infinite loop

The code:

from urllib2 import Request, urlopen
request = Request(url)
response = urlopen(request)

I understand that the server redirects to another URL and that it is looking for a cookie. How do I set the cookie it is looking for so I can read the HTML?

Answer 1: Here's an example from the Python documentation, adjusted to …
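A sketch of the cookie-jar approach that answer points at, using Python 3's urllib.request (the successor to urllib2). The URL is left out because the question doesn't name it; the commented call shows where it would go:

```python
import http.cookiejar
import urllib.request

# A CookieJar records every Set-Cookie header the server sends during the
# redirect, and the opener replays those cookies on each follow-up request,
# so the redirect loop resolves on its own.
jar = http.cookiejar.CookieJar()
opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))

# hypothetical usage -- substitute the redirecting URL from the question:
# html = opener.open("http://example.com/needs-cookie").read()
```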

Parsing HTTP Response in Python

半城伤御伤魂 · Posted on 2019-12-18 10:42:09

Question: I want to manipulate the information at THIS url. I can successfully open it and read its contents, but what I really want to do is throw out all the stuff I don't want and manipulate the stuff I want to keep. Is there a way to convert the string into a dict so I can iterate over it? Or do I just have to parse it as is (str type)?

from urllib.request import urlopen
url = 'http://www.quandl.com/api/v1/datasets/FRED/GDP.json'
response = urlopen(url)
print(response.read())  # returns string
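Since the endpoint returns JSON, the string can indeed become a dict via the standard json module. A minimal sketch; the sample payload below is invented for the offline demonstration, not Quandl's actual schema:

```python
import json
from urllib.request import urlopen

def fetch_json(url):
    # read() returns bytes; decode, then parse into native dicts and lists
    with urlopen(url) as response:
        return json.loads(response.read().decode("utf-8"))

# offline demonstration of the parsing step with a made-up payload:
payload = b'{"column_names": ["Date", "Value"], "data": [["2013-01-01", 16502.4]]}'
record = json.loads(payload.decode("utf-8"))
print(record["column_names"])  # ['Date', 'Value']
```

Once parsed, `record` is an ordinary dict, so keys can be iterated and unwanted entries dropped with normal dict operations.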

TypeError: urlopen() got multiple values for keyword argument 'body' while executing tests through Selenium and Python on Kubuntu 14.04

≯℡__Kan透↙ · Posted on 2019-12-18 08:01:02

Question: I'm trying to run a Selenium script in Python on Kubuntu 14.04. I get this error message with both chromedriver and geckodriver:

Traceback (most recent call last):
  File "vse.py", line 15, in <module>
    driver = webdriver.Chrome(chrome_options=options, executable_path=r'/root/Desktop/chromedriver')
  File "/usr/local/lib/python3.4/dist-packages/selenium/webdriver/chrome/webdriver.py", line 75, in __init__
    desired_capabilities=desired_capabilities)
  File "/usr/local/lib/python3.4/dist…
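This particular error usually comes from a version mismatch between selenium and the urllib3 it calls into (upgrading both together is the common fix). The TypeError itself just means one parameter received both a positional and a keyword value, which a few lines of plain Python can reproduce; the function below is a stand-in, not urllib3's real signature:

```python
# stand-in for urllib3's urlopen(); real signatures vary by version
def urlopen(method, url, body=None):
    return body

try:
    # 'body' arrives positionally AND as a keyword -> the same TypeError
    urlopen("POST", "http://example.com", "data", body="data")
except TypeError as exc:
    message = str(exc)

print(message)
```

When an old selenium passes arguments positionally to a newer urllib3 whose parameters have shifted, the library collides with itself in exactly this way.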

Tell urllib2 to use custom DNS

六月ゝ 毕业季﹏ · Posted on 2019-12-17 07:15:18

Question: I'd like to tell urllib2.urlopen (or a custom opener) to use 127.0.0.1 (or ::1) to resolve addresses. I wouldn't change my /etc/resolv.conf, however. One possible solution is to use a tool like dnspython to query addresses and httplib to build a custom URL opener. I'd prefer telling urlopen to use a custom nameserver, though. Any suggestions?

Answer 1: It looks like name resolution is ultimately handled by socket.create_connection:

urllib2.urlopen -> httplib.HTTPConnection -> socket.create…
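One way to act on that call chain without touching /etc/resolv.conf is to wrap socket.getaddrinfo, which create_connection ultimately calls. The override table below is hypothetical; a real setup might consult dnspython at that point instead:

```python
import socket

DNS_OVERRIDES = {"example.internal": "127.0.0.1"}  # hypothetical mapping

_original_getaddrinfo = socket.getaddrinfo

def patched_getaddrinfo(host, port, *args, **kwargs):
    # rewrite the hostname before the OS resolver ever sees it
    return _original_getaddrinfo(DNS_OVERRIDES.get(host, host), port, *args, **kwargs)

socket.getaddrinfo = patched_getaddrinfo

# urllib2.urlopen -> httplib -> socket.create_connection -> getaddrinfo,
# so urlopen() now resolves overridden names through the table above
info = socket.getaddrinfo("example.internal", 80)
print(info[0][4][0])  # 127.0.0.1
```

Patching a module-level function like this is process-wide, so it affects every library in the interpreter, not just urllib2.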

how to fix beautifulsoup ssl CERTIFICATE_VERIFY_FAILED error

老子叫甜甜 · Posted on 2019-12-13 09:49:34

Question: Code:

import requests
from bs4 import BeautifulSoup
from urllib.request import Request, urlopen

html = urlopen("https://www.familyeducation.com/baby-names/browse-origin/surname/german")
soup = BeautifulSoup(html)
metadata = soup.find_all('meta')

Error:

urlopen error [SSL: CERTIFICATE_VERIFY_FAILED]

Answer 1: For this error, check out this answer: urllib and "SSL: CERTIFICATE_VERIFY_FAILED" Error. But you don't always need urlopen for an HTML request; you can also send the request through the requests lib.
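If the verification failure has to be bypassed rather than fixed (for instance, the machine's CA bundle is broken), urlopen accepts an ssl context with the checks disabled. A sketch; note that repairing the certificate store (e.g. installing the certifi package) is the safer long-term fix:

```python
import ssl
from urllib.request import urlopen

# a context that skips hostname and certificate checks -- use only when
# you consciously accept not authenticating the server
context = ssl.create_default_context()
context.check_hostname = False          # must be disabled before verify_mode
context.verify_mode = ssl.CERT_NONE

# html = urlopen("https://www.familyeducation.com/baby-names/browse-origin/surname/german",
#                context=context).read()
```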

Login to a website using python

余生长醉 · Posted on 2019-12-12 21:56:13

Question: I am trying to log in to this page using Python. Here is my code:

from urllib2 import urlopen
from bs4 import BeautifulSoup
import requests
import sys

URL = 'http://coe2.annauniv.edu/result/index.php'
soup = BeautifulSoup(urlopen(URL))
for hit in soup.findAll(attrs={'class': 's2'}):
    print hit.contents[0].strip()

RegisterNumber = raw_input("enter the register number")
DateofBirth = raw_input("enter the date of birth [DD-MM-YYYY]")
login_input = raw_input("enter the what is()?")

def main():
    # …
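A login like this is usually done by POSTing the form fields back to the page. A stdlib sketch in Python 3; the field names below are hypothetical, so read the real <input> names out of the login <form> (the BeautifulSoup pass above can do that) before using it:

```python
import urllib.parse
import urllib.request

URL = 'http://coe2.annauniv.edu/result/index.php'

# hypothetical form fields -- inspect the page's HTML for the real names
payload = {
    "register_no": "2013123456",
    "dob": "01-01-1995",
}
data = urllib.parse.urlencode(payload).encode("ascii")

# supplying a data argument turns the request into a POST
request = urllib.request.Request(URL, data=data)
print(request.get_method())  # POST

# response = urllib.request.urlopen(request)
# print(response.read())
```

If the site sets a session cookie at login, combine this with an HTTPCookieProcessor-based opener so the cookie survives into later requests.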

Why does text retrieved from pages sometimes look like gibberish?

你。 · Posted on 2019-12-12 17:42:05

Question: I'm using urllib and urllib2 in Python to open and read webpages, but sometimes the text I get is unreadable. For example, if I run this:

import urllib
text = urllib.urlopen('http://tagger.steve.museum/steve/object/141913').read()
print text

I get some unreadable text. I've read these posts: "Gibberish from urlopen" and "Does python urllib2 automatically uncompress gzip data fetched from webpage?" but can't seem to find my answer. Thank you in advance for your help!

UPDATE: I fixed the problem by …
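The "gibberish" in cases like this is typically a gzip-compressed body that urllib hands back verbatim (browsers decompress it silently). A sketch of detecting and undoing the compression, shown with Python 3's urllib.request, with the round-trip demonstrated offline:

```python
import gzip
import urllib.request

def read_page(url):
    req = urllib.request.Request(url, headers={"Accept-Encoding": "gzip"})
    with urllib.request.urlopen(req) as resp:
        raw = resp.read()
        # urllib does not decompress for you the way a browser does
        if resp.headers.get("Content-Encoding") == "gzip":
            raw = gzip.decompress(raw)
    return raw

# offline demonstration of the decompression step:
body = "readable text".encode("utf-8")
restored = gzip.decompress(gzip.compress(body))
print(restored.decode())  # readable text
```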

Python urllib freezes with specific URL

你离开我真会死。 · Posted on 2019-12-12 14:21:21

Question: I am trying to fetch a page, and urlopen hangs and never returns anything, although the web page is very light and can be opened with any browser without any problems:

import urllib.request

with urllib.request.urlopen("http://www.planalto.gov.br/ccivil_03/_Ato2007-2010/2008/Lei/L11882.htm") as response:
    print(response.read())

This simple code just freezes while retrieving the response, but if you try to open http://www.planalto.gov.br/ccivil_03/_Ato2007-2010/2008/Lei/L11882.htm it opens without …
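A common culprit when a single URL stalls only from scripts is the server holding connections that present urllib's default "Python-urllib" User-Agent. Sending a browser-like header, plus a timeout so the call can never hang indefinitely, is a reasonable first experiment; the header explanation is an assumption, not a documented rule of that server:

```python
import urllib.request

URL = "http://www.planalto.gov.br/ccivil_03/_Ato2007-2010/2008/Lei/L11882.htm"

# browser-like User-Agent (assumption: the server stalls urllib's default one)
request = urllib.request.Request(URL, headers={"User-Agent": "Mozilla/5.0"})
print(request.get_header("User-agent"))  # Mozilla/5.0

# the timeout turns an indefinite hang into a catchable exception
# with urllib.request.urlopen(request, timeout=10) as response:
#     print(response.read())
```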

python urllib2.urlopen(url) process block

[亡魂溺海] · Posted on 2019-12-12 09:27:22

Question: I am using urllib2.urlopen() and my process is getting blocked. I am aware that urllib2.urlopen() has a default timeout. How do I make the call non-blocking? The backtrace is:

(gdb) bt
#0  0x0000003c6200dc35 in recv () from /lib64/libpthread.so.0
#1  0x00002b88add08137 in ?? () from /usr/lib64/python2.6/lib-dynload/_socketmodule.so
#2  0x00002b88add0830e in ?? () from /usr/lib64/python2.6/lib-dynload/_socketmodule.so
#3  0x000000310b2d8e19 in PyEval_EvalFrameEx () from /usr/lib64/libpython2.6.so.1.0

Answer 1: …
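Two stdlib knobs make the call return instead of sitting in recv() forever. Shown here with Python 3's urllib.request, but urllib2.urlopen in Python 2.6 accepts the same timeout argument:

```python
import socket
import urllib.request

# per-call timeout: raises an exception instead of blocking indefinitely
# response = urllib.request.urlopen("http://example.com", timeout=10)

# process-wide default, which also covers library code that never
# passes a timeout of its own:
socket.setdefaulttimeout(10)
print(socket.getdefaulttimeout())  # 10.0
```

Neither option makes urlopen truly asynchronous; for that, the fetch has to move to a worker thread or an async HTTP client.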