Getting Started
- Sending requests:
import requests
r = requests.get("http://httpbin.org/get") # GET
r = requests.post("http://httpbin.org/post") # POST
r = requests.put("http://httpbin.org/put") # PUT
r = requests.delete("http://httpbin.org/delete") # DELETE
r = requests.head("http://httpbin.org/get") # HEAD
r = requests.options("http://httpbin.org/get") # OPTIONS
- URL parameters
>>> payload = {'key1': 'value1', 'key2': 'value2'}
>>> r = requests.get("http://httpbin.org/get", params=payload)
>>> print(r.url)
http://httpbin.org/get?key1=value1&key2=value2
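The encoding of params can be checked without sending anything by preparing the request locally. This is a minimal sketch, assuming Python 3.7+ (dictionary order is preserved); the list value for key2 is an illustrative addition and not part of the example above.

```python
import requests

# Build the request locally; prepare() encodes the URL but sends nothing.
payload = {"key1": "value1", "key2": ["value2", "value3"]}
prepared = requests.Request("GET", "http://httpbin.org/get", params=payload).prepare()

# A list value is expanded into one key=value pair per item.
print(prepared.url)
```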
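- Response content
>>> r = requests.get("http://httpbin.org/get")
>>> type(r.text) # body decoded to a string
<class 'str'>
>>> type(r.content) # raw body as bytes
<class 'bytes'>
>>> r.json() # body parsed as JSON
>>> r.encoding # encoding used to decode r.text
>>> r.headers # response headers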
- Custom request headers
>>> import json
>>> payload = {'key1': 'value1', 'key2': 'value2'}
>>> r = requests.post("http://httpbin.org/post", data=payload)
>>> url = 'https://api.github.com/some/endpoint'
>>> r = requests.post(url, data=json.dumps(payload))
>>> payload = {'some': 'data'}
>>> headers = {'content-type': 'application/json'}
>>> r = requests.post(url, data=json.dumps(payload), headers=headers)
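As an alternative to calling json.dumps and setting the content-type header by hand, reasonably recent Requests releases accept a json= argument. The sketch below assumes such a release and only prepares the request locally, so nothing is sent to the placeholder endpoint.

```python
import json
import requests

payload = {"some": "data"}
url = "https://api.github.com/some/endpoint"

# json= serializes the payload and sets the Content-Type header.
prepared = requests.Request("POST", url, json=payload).prepare()

content_type = prepared.headers["Content-Type"]
decoded_body = json.loads(prepared.body)
print(content_type, decoded_body)
```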
- Status codes
>>> r = requests.get('http://httpbin.org/get')
>>> r.status_code
200
>>> r.ok
True
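For unsuccessful status codes, r.ok is False and raise_for_status() turns the code into an exception. A minimal sketch follows; the hand-built Response object is an assumption that stands in for a real server reply so the example runs offline.

```python
import requests

# A hand-built Response stands in for a real server reply (demo assumption).
response = requests.Response()
response.status_code = 404

error = None
try:
    response.raise_for_status()  # raises HTTPError for 4xx and 5xx codes
except requests.exceptions.HTTPError as exc:
    error = exc

# requests.codes offers readable names for status codes.
print(response.ok, response.status_code == requests.codes.not_found)
```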
- Cookies
>>> url = 'http://httpbin.org/cookies'
>>> cookies = dict(cookies_are='working')
>>> r = requests.get(url, cookies=cookies)
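To check what the cookies argument actually puts on the wire, the request can be prepared locally and its Cookie header inspected. This is only a sketch; nothing is sent to httpbin.org.

```python
import requests

url = "http://httpbin.org/cookies"
cookies = dict(cookies_are="working")

# Prepare locally to inspect the Cookie header that would be sent.
prepared = requests.Request("GET", url, cookies=cookies).prepare()
cookie_header = prepared.headers["Cookie"]
print(cookie_header)
```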
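- Redirects and timeouts
Requests follows redirects automatically for every method except HEAD.
>>> r = requests.get('http://github.com')
>>> r.history # responses followed on the way to r, oldest first
>>> r = requests.head('http://github.com', allow_redirects=True) # opt in to redirects for HEAD
>>> requests.get('http://github.com', timeout=0.001)
timeout is not a limit on the total download time of the response. It bounds the connection attempt and each wait for data from the server: an exception is raised if no bytes arrive on the socket for timeout seconds.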
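timeout also accepts a (connect, read) tuple, which makes the two phases visible. The sketch below is an assumption-laden demo: a local listening socket that never replies plays the role of a stalled server, so the connection succeeds but the read phase is expected to time out.

```python
import socket
import requests

# A listening socket that never accepts or replies stands in for a stalled server.
stalled = socket.socket()
stalled.bind(("127.0.0.1", 0))
stalled.listen(1)
port = stalled.getsockname()[1]

session = requests.Session()
session.trust_env = False  # ignore proxy environment variables for this local demo

outcome = "no timeout"
try:
    # timeout=(connect timeout, read timeout), both in seconds
    session.get(f"http://127.0.0.1:{port}/", timeout=(1.0, 0.2))
except requests.exceptions.ConnectTimeout:
    outcome = "connect timeout"
except requests.exceptions.ReadTimeout:
    outcome = "read timeout"
finally:
    stalled.close()

print(outcome)
```

Both exception classes derive from requests.exceptions.Timeout, so catching Timeout covers either case.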
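- Exceptions
A network problem (for example a failed DNS lookup or a refused connection) makes Requests raise a ConnectionError.
Response.raise_for_status() raises an HTTPError when the response carries an unsuccessful status code.
A request that times out raises a Timeout.
A request that exceeds the configured maximum number of redirects raises a TooManyRedirects.
All exceptions that Requests raises explicitly inherit from requests.exceptions.RequestException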
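Because every explicit exception shares one base class, a single except clause can cover them all. A minimal sketch: fetch_status is a hypothetical helper, and the scheme-less URL is chosen because Requests rejects it before any network traffic happens.

```python
import requests
from requests.exceptions import RequestException

def fetch_status(url, timeout=5):
    """Return the status code, or a short error description (hypothetical helper)."""
    try:
        response = requests.get(url, timeout=timeout)
        response.raise_for_status()
        return response.status_code
    except RequestException as exc:
        # ConnectionError, HTTPError, Timeout and TooManyRedirects all land here.
        return f"request failed: {type(exc).__name__}"

# A URL without a scheme is rejected while the request is being prepared.
result = fetch_status("httpbin.org/get")
print(result)
```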
Advanced usage
- Sessions
s = requests.Session()
s.auth = ('user', 'pass')
s.headers.update({'x-test': 'true'})
# both 'x-test' and 'x-test2' are sent
s.get('http://httpbin.org/headers', headers={'x-test2': 'true'})
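Any dictionary passed to a request method is merged with the session-level values that are already set. Method-level parameters override session parameters.
- Prepared requests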
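The merge can be observed offline with Session.prepare_request. In this sketch the header name x-remove is hypothetical; it is there to show that a method-level value of None drops a session-level key.

```python
import requests

session = requests.Session()
session.headers.update({"x-test": "true", "x-remove": "session-only"})

# Method-level headers are merged with session headers; None drops a session key.
request = requests.Request(
    "GET",
    "http://httpbin.org/headers",
    headers={"x-test2": "true", "x-remove": None},
)
prepared = session.prepare_request(request)

print(dict(prepared.headers))
```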
from requests import Request, Session
# url, data, headers and the send() options below are placeholders to fill in
s = Session()
req = Request('GET', url,
    data=data,
    headers=headers
)
prepped = req.prepare()
# do something with prepped.body
# do something with prepped.headers
resp = s.send(prepped,
    stream=stream,
    verify=verify,
    proxies=proxies,
    cert=cert,
    timeout=timeout
)
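To prepare a request with session-level state applied (for example cookies), use Session.prepare_request() instead of Request.prepare():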
from requests import Request, Session
s = Session()
req = Request('GET', url,
    data=data,
    headers=headers
)
prepped = s.prepare_request(req)
# do something with prepped.body
# do something with prepped.headers
resp = s.send(prepped,
    stream=stream,
    verify=verify,
    proxies=proxies,
    cert=cert,
    timeout=timeout
)
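The difference between the two preparation paths can be made concrete without sending anything. In this sketch the cookie value and the X-Demo header are illustrative assumptions.

```python
import requests

session = requests.Session()
session.cookies.set("sessioncookie", "123456789")  # illustrative session state

request = requests.Request("GET", "http://httpbin.org/cookies")

standalone = request.prepare()                    # ignores the session
with_session = session.prepare_request(request)   # applies session cookies and headers

# A prepared request can still be edited before session.send(with_session).
with_session.headers["X-Demo"] = "edited"  # hypothetical header name

print(standalone.headers.get("Cookie"), with_session.headers.get("Cookie"))
```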
- SSL certificate verification
>>> requests.get('https://github.com', verify=True)
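verify=True is already the default, so it rarely needs to be spelled out. verify also accepts a path to a CA bundle, and it can be set once on a session. A small sketch, assuming the bundle that ships alongside Requests is an acceptable stand-in for a custom CA file:

```python
import os
import requests
from requests import certs

session = requests.Session()
default_verify = session.verify  # verification is on unless changed

# verify also accepts a path to a CA bundle; here, the bundle Requests already uses.
ca_bundle = certs.where()
session.verify = ca_bundle

print(default_verify, os.path.basename(ca_bundle))
```

Passing verify=False turns certificate checks off entirely and is best limited to local testing.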
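- Body content workflow
By default the response body is downloaded as soon as the request is made. The stream parameter overrides this and defers the download until the Response.content attribute is accessed: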
tarball_url = 'https://github.com/kennethreitz/requests/tarball/master'
r = requests.get(tarball_url, stream=True)
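At this point only the response headers have been downloaded and the connection stays open, so the body can be fetched conditionally:
if int(r.headers['content-length']) < TOO_LONG:
    content = r.content
    ...
You can control the workflow further with the Response.iter_content and Response.iter_lines methods, or read from the underlying urllib3 urllib3.HTTPResponse through Response.raw: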
import json
import requests
r = requests.get('http://httpbin.org/stream/20', stream=True)
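for line in r.iter_lines():
    # filter out keep-alive new lines
    if line:
        print(json.loads(line))
With stream=True the connection is released back to the pool only once the body has been read completely or the response is closed. To read part of the body and then release the connection, use a context manager:
from contextlib import closing
with closing(requests.get('http://httpbin.org/get', stream=True)) as r:
    # Do things with the response here.
    pass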
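Putting the pieces together, a conditional chunked download might look like the sketch below. Everything around the download function is an assumption made so the example runs offline: a throwaway local HTTP server plays the remote host, and the helper name, payload and size limits are hypothetical.

```python
import http.server
import threading
from contextlib import closing

import requests

PAYLOAD = b"x" * 10_000  # demo body served by the local test server

class Handler(http.server.BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Length", str(len(PAYLOAD)))
        self.end_headers()
        self.wfile.write(PAYLOAD)

    def log_message(self, *args):
        pass  # keep the demo quiet

def download(url, max_bytes, chunk_size=1024):
    """Return the body, or None if the declared length exceeds max_bytes."""
    session = requests.Session()
    session.trust_env = False  # ignore proxy environment variables for localhost
    with closing(session.get(url, stream=True, timeout=5)) as r:
        r.raise_for_status()
        if int(r.headers.get("Content-Length", 0)) > max_bytes:
            return None  # body is never downloaded; closing() releases the connection
        return b"".join(r.iter_content(chunk_size=chunk_size))

server = http.server.HTTPServer(("127.0.0.1", 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()
url = f"http://127.0.0.1:{server.server_port}/"

small_enough = download(url, max_bytes=20_000)
too_big = download(url, max_bytes=1_000)

server.shutdown()
server.server_close()
```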
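- Event hooks
The callback function receives the hooked data as its first argument; for the response hook this is the Response object.
def print_url(r, *args, **kwargs):
    print(r.url)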
>>> requests.get('http://httpbin.org', hooks=dict(response=print_url))
http://httpbin.org
<Response [200]>
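Hooks can also be registered once on a session so that they run for every response. The sketch below uses a hypothetical raise_on_error hook and calls it by hand on a fabricated 500 response, purely to show its effect without a network round trip.

```python
import requests

def raise_on_error(response, *args, **kwargs):
    """Response hook: turn 4xx and 5xx replies into HTTPError."""
    response.raise_for_status()

session = requests.Session()
# Hooks registered on the session run for every response it receives.
session.hooks["response"].append(raise_on_error)

# Call the hook by hand on a fabricated 500 response to show what it does.
fake = requests.Response()
fake.status_code = 500
try:
    raise_on_error(fake)
    hook_result = "passed"
except requests.exceptions.HTTPError:
    hook_result = "raised HTTPError"

print(hook_result)
```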
- Proxies
import requests
proxies = {
"http": "http://10.10.1.10:3128",
"https": "http://10.10.1.10:1080",
}
requests.get("http://example.org", proxies=proxies)
Proxies can also be configured through environment variables:
$ export HTTP_PROXY="http://10.10.1.10:3128"
$ export HTTPS_PROXY="http://10.10.1.10:1080"
$ python
>>> import requests
>>> requests.get("http://example.org")
If your proxy requires HTTP Basic Auth, use the http://user:password@host/ syntax:
proxies = {
"http": "http://user:pass@10.10.1.10:3128/",
}
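Proxy keys may name a scheme or a specific scheme://host, and they can be set once on a session. The sketch below uses select_proxy from requests.utils, a helper the library applies internally, only to inspect which entry would be chosen; the proxy addresses are illustrative and no request is sent.

```python
import requests
from requests.utils import select_proxy

proxies = {
    "http": "http://10.10.1.10:3128",
    # A scheme://host key applies only to that host (addresses are illustrative).
    "http://10.20.1.128": "http://10.10.1.10:5323",
}

session = requests.Session()
session.proxies.update(proxies)  # applies to every request made through the session

general = select_proxy("http://example.org/", session.proxies)
specific = select_proxy("http://10.20.1.128/path", session.proxies)
unmatched = select_proxy("https://example.org/", session.proxies)
print(general, specific, unmatched)
```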
- Authentication
Requests simplifies many forms of authentication, including the very common Basic Auth
>>> from requests.auth import HTTPBasicAuth
>>> requests.get('https://api.github.com/user', auth=HTTPBasicAuth('user', 'pass'))
<Response [200]>
Requests also provides a shorthand:
>>> requests.get('https://api.github.com/user', auth=('user', 'pass'))
<Response [200]>
Digest authentication:
>>> from requests.auth import HTTPDigestAuth
>>> url = 'http://httpbin.org/digest-auth/auth/user/pass'
>>> requests.get(url, auth=HTTPDigestAuth('user', 'pass'))
<Response [200]>
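What an auth argument does to the outgoing request can be inspected by preparing it locally. The sketch also shows a custom scheme built on requests.auth.AuthBase; TokenAuth and its token value are hypothetical and not part of Requests.

```python
import base64
import requests
from requests.auth import AuthBase

class TokenAuth(AuthBase):
    """Hypothetical auth class that sends a bearer token."""

    def __init__(self, token):
        self.token = token

    def __call__(self, request):
        # Called while the request is prepared; must return the request.
        request.headers["Authorization"] = f"Bearer {self.token}"
        return request

url = "https://api.github.com/user"
basic = requests.Request("GET", url, auth=("user", "pass")).prepare()
token = requests.Request("GET", url, auth=TokenAuth("example-token")).prepare()

# Basic Auth is the base64 encoding of "user:pass" behind a "Basic " prefix.
expected_basic = "Basic " + base64.b64encode(b"user:pass").decode("ascii")
print(basic.headers["Authorization"], token.headers["Authorization"])
```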
Source: oschina
Link: https://my.oschina.net/u/347219/blog/659242