I currently have a little script that downloads a webpage and extracts some data I'm interested in. Nothing fancy.
Currently I'm downloading the page like so:
The requests module provides a modern API to HTTP/HTTPS capabilities.
import requests
url = 'https://www.someserver.com/toplevelurl/somepage.htm'
res = requests.get(url, auth=('USER', 'PASSWORD'))
status = res.status_code
text = res.text
The documentation says this should be straightforward, as long as your local Python has SSL support.
If you just use HTTP Basic Authentication, you must install a different handler, as described in the urllib2 documentation.
Quoting the example there:
import urllib2
theurl = 'http://www.someserver.com/toplevelurl/somepage.htm'
username = 'johnny'
password = 'XXXXXX'
# a great password
passman = urllib2.HTTPPasswordMgrWithDefaultRealm()
# this creates a password manager
passman.add_password(None, theurl, username, password)
# because we have put None at the start it will always
# use this username/password combination for urls
# for which `theurl` is a super-url
authhandler = urllib2.HTTPBasicAuthHandler(passman)
# create the AuthHandler
opener = urllib2.build_opener(authhandler)
urllib2.install_opener(opener)
# All calls to urllib2.urlopen will now use our handler
# `theurl` passed to add_password may be a full URL (including the
# scheme, as above) or just the host; it must not contain a
# userinfo component such as "user:password@example.com".
pagehandle = urllib2.urlopen(theurl)
# authentication is now handled automatically for us
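In Python 3, urllib2's contents moved into urllib.request, with the class and function names otherwise unchanged. The same setup, sketched with the placeholder URL and credentials from the example (the final fetch is left commented out since the server is a placeholder):

```python
import urllib.request

theurl = 'http://www.someserver.com/toplevelurl/somepage.htm'
# password manager; None as the realm means this username/password
# pair is used for any URL that theurl is a prefix of
passman = urllib.request.HTTPPasswordMgrWithDefaultRealm()
passman.add_password(None, theurl, 'johnny', 'XXXXXX')
authhandler = urllib.request.HTTPBasicAuthHandler(passman)
opener = urllib.request.build_opener(authhandler)
urllib.request.install_opener(opener)
# urllib.request.urlopen(theurl) would now authenticate automatically
```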
If you use Digest authentication, you'll have to set up an additional handler (urllib2.HTTPDigestAuthHandler), but the setup is the same regardless of SSL usage. Search for python+urllib2+http+digest.
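With the requests module from the first snippet, Digest needs no extra handler wiring: requests ships an HTTPDigestAuth class you pass via the same auth parameter. A minimal sketch, reusing the placeholder URL and credentials from the question (the actual request is commented out since the server is a placeholder):

```python
import requests
from requests.auth import HTTPDigestAuth

url = 'https://www.someserver.com/toplevelurl/somepage.htm'
auth = HTTPDigestAuth('USER', 'PASSWORD')
# requests performs the digest challenge/response handshake for you:
# res = requests.get(url, auth=auth)
```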
Cheers,
The urllib2 documentation has an example of working with Basic Authentication:
http://docs.python.org/library/urllib2.html#examples