I want to open and read https://yande.re/ with urllib.request, but I'm getting an SSL error. I can open and read the page just fine using http.client with this code:
import http.client
conn = http.client.HTTPSConnection('www.yande.re')
conn.request('GET', 'https://yande.re/')
resp = conn.getresponse()
data = resp.read()
However, the following code using urllib.request fails:
import urllib.request
opener = urllib.request.build_opener()
resp = opener.open('https://yande.re/')
data = resp.read()
It gives me the following error: ssl.SSLError: [Errno 1] _ssl.c:392: error:1411809D:SSL routines:SSL_CHECK_SERVERHELLO_TLSEXT:tls invalid ecpointformat list. Why can I open the page with HTTPSConnection but not opener.open?
Edit: Here's my OpenSSL version and the traceback from trying to open https://yande.re/
>>> import ssl; ssl.OPENSSL_VERSION
'OpenSSL 1.0.0a 1 Jun 2010'
>>> import urllib.request
>>> urllib.request.urlopen('https://yande.re/')
Traceback (most recent call last):
  File "<pyshell#3>", line 1, in <module>
    urllib.request.urlopen('https://yande.re/')
  File "C:\Python32\lib\urllib\request.py", line 138, in urlopen
    return opener.open(url, data, timeout)
  File "C:\Python32\lib\urllib\request.py", line 369, in open
    response = self._open(req, data)
  File "C:\Python32\lib\urllib\request.py", line 387, in _open
    '_open', req)
  File "C:\Python32\lib\urllib\request.py", line 347, in _call_chain
    result = func(*args)
  File "C:\Python32\lib\urllib\request.py", line 1171, in https_open
    context=self._context, check_hostname=self._check_hostname)
  File "C:\Python32\lib\urllib\request.py", line 1138, in do_open
    raise URLError(err)
urllib.error.URLError: <urlopen error [Errno 1] _ssl.c:392: error:1411809D:SSL routines:SSL_CHECK_SERVERHELLO_TLSEXT:tls invalid ecpointformat list>
>>>
What a coincidence! I'm having the same problem as you are, with an added complication: I'm behind a proxy. I found this bug report regarding https-not-working-with-urllib. Luckily, they posted a workaround.
import urllib.request
import ssl
# Force SSLv3 instead of the default SSLv23 method (the workaround from the bug report).
https_sslv3_handler = urllib.request.HTTPSHandler(
    context=ssl.SSLContext(ssl.PROTOCOL_SSLv3))
opener = urllib.request.build_opener(https_sslv3_handler)
## Uncomment this code if you're behind a proxy.
## The HTTPS port is 443, but that didn't work for me, so I used port 80 instead.
##proxy_auth = '{0}://{1}:{2}@{3}'.format('https', 'username', 'password',
##                                        'proxy:80')
##proxies = {'https': proxy_auth}
##proxy = urllib.request.ProxyHandler(proxies)
##proxy_auth_handler = urllib.request.HTTPBasicAuthHandler()
##opener = urllib.request.build_opener(proxy, proxy_auth_handler,
##                                     https_sslv3_handler)
urllib.request.install_opener(opener)
resp = opener.open('https://yande.re/')
data = resp.read().decode('utf-8')
print(data)
Btw, thanks for showing how to use http.client. I didn't know that there's another library that can be used to connect to the internet. ;)
This is due to a bug in the early 1.x OpenSSL implementation of elliptic curve cryptography. Take a closer look at the relevant part of the exception:
_ssl.c:392: error:1411809D:SSL routines:SSL_CHECK_SERVERHELLO_TLSEXT:tls invalid ecpointformat list
This is an error from the underlying OpenSSL library code, caused by its mishandling of the EC point format TLS extension. One workaround is to use the SSLv3 method instead of SSLv23; another is to use a cipher suite specification that disables all ECC cipher suites (I had good results with ALL:-ECDH; use openssl ciphers for testing). The real fix is to update OpenSSL.
The problem is due to the hostnames that you're giving in the two examples:
import http.client
conn = http.client.HTTPSConnection('www.yande.re')
conn.request('GET', 'https://yande.re/')
and...
import urllib.request
urllib.request.urlopen('https://yande.re/')
Note that in the first example you're asking the client to connect to the host www.yande.re, whereas in the second example urllib first parses the URL 'https://yande.re' and then sends the request to the host yande.re.
Although www.yande.re and yande.re may resolve to the same IP address, from the web server's perspective they are different virtual hosts. My guess is that you had an SNI configuration problem on your web server's side. Since the original question was posted on May 21 and the current cert at yande.re starts May 28, I'm thinking that you already fixed this problem?
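To see which host urllib would derive from a URL, something like the following can help. It is a small sketch that only parses the URL and makes no network request; connection_target is a hypothetical helper name, and the default port of 443 is an assumption for https URLs.

```python
from urllib.parse import urlsplit


def connection_target(url):
    """Return the (host, port, path) an HTTPS client would use for url.

    Hypothetical helper for illustration only.
    """
    parts = urlsplit(url)
    host = parts.hostname
    port = parts.port or 443   # assumed default port for https
    path = parts.path or "/"
    return host, port, path


# Host passed to HTTPSConnection in the question's first example.
http_client_host = "www.yande.re"

# Host that urllib extracts from the URL in the second example.
urllib_host, _, _ = connection_target("https://yande.re/")

print(urllib_host)                      # yande.re
print(urllib_host == http_client_host)  # False
```

Passing the same host to both libraries makes the two requests directly comparable.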
Try this:
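import urllib.request
import urllib.error
url = 'http://www.google.com/'
try:
    # Fetch the page and decode it for display.
    webpage = urllib.request.urlopen(url).read().decode('utf-8', errors='replace')
except urllib.error.URLError:
    # Fall back to a placeholder message when the page cannot be fetched.
    webpage = 'This webpage is not available!'
print(webpage)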
Source: https://stackoverflow.com/questions/10678695/in-python-3-2-i-can-open-and-read-an-https-web-page-with-http-client-but-urlli