Requests - get content-type/size without fetching the whole page/content

悲&欢浪女 2021-02-07 12:35

I have a simple website crawler. It works fine, but sometimes it gets stuck on large content such as ISO images, .exe files and other large files. Guessing content-type using

4 answers

不思量自难忘° (OP)

2021-02-07 13:06
requests.head() does NOT follow redirects by default, so if a URL is redirected, you get the headers of the redirect response itself and Content-Length will be 0. Make sure to pass allow_redirects=True.
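```
import requests
# HEAD returns headers only; follow redirects to reach the final resource
r = requests.head(url, allow_redirects=True)
length = r.headers.get('Content-Length')  # None if the header is missing
```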
See the "Redirection and History" section of the Requests documentation.
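
For the crawler case in the question, here is a rough sketch of how the HEAD result might be used to skip large or non-HTML files before downloading anything. The names `wanted` and `probe`, the allowed type list, the 5 MB limit and the 10 second timeout are all my own assumptions for illustration, not part of Requests. Note that some servers omit Content-Length on HEAD responses, so the sketch treats a missing size as "unknown" rather than as zero.

```python
def wanted(headers, allowed_types=("text/html",), max_bytes=5_000_000):
    """Decide from response headers alone whether a URL is worth downloading.

    `headers` is any mapping with a .get() method, such as r.headers.
    The defaults are illustrative assumptions; tune them for your crawler.
    """
    content_type = headers.get("Content-Type", "")
    # Drop parameters such as "; charset=utf-8" before comparing.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type not in allowed_types:
        return False

    length = headers.get("Content-Length")
    if length is None:
        # Size unknown: accept here, but consider capping the later download.
        return True
    try:
        return int(length) <= max_bytes
    except ValueError:
        # A malformed Content-Length is treated as "do not fetch".
        return False


def probe(url, timeout=10):
    """Send a HEAD request (following redirects) and apply wanted()."""
    # Imported here so wanted() can be used and tested without Requests.
    import requests

    r = requests.head(url, allow_redirects=True, timeout=timeout)
    return wanted(r.headers)
```

In the crawler you would call `probe(url)` first and only issue `requests.get(url)` when it returns True. If the size is unknown, `requests.get(url, stream=True)` lets you read the body in chunks and stop once your limit is exceeded.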