Question
I am trying to scrape a web page, but for some reason I am getting HTTP Error 400: Bad Request. I have never had this before.
Any ideas?
Here is my code:
import urllib.request
import re

url = ('https://www.myvue.com/whats-on')
req = urllib.request.Request(url, headers={'User Agent': 'Mozilla/5.0'})

def main():
    html_page = urllib.request.urlopen(req).read()
    content = html_page.decode(errors='ignore', encoding='utf-8')
    headings = re.findall('<th scope="col" abbr="(.*?)">', content)
    print(headings)

main()
Answer 1:
Fix your header:
req = urllib.request.Request(url, headers={'User-Agent': 'Mozilla/5.0'})
The header name is User-Agent (with a hyphen), not User Agent (with a space).
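To see why the space matters: HTTP header names have to be single tokens with no whitespace, so a name like User Agent is malformed and the server is entitled to answer with 400. Here is a rough sketch that checks a Request for such names before you send it. The helper invalid_header_names and the TOKEN pattern are my own illustration, not part of urllib, and the pattern is my reading of the token grammar in RFC 7230.

```python
import re
import urllib.request

# Characters allowed in an HTTP header name (the "token" rule, as I read RFC 7230).
# Note that a space is not among them.
TOKEN = re.compile(r"^[!#$%&'*+\-.^_`|~0-9A-Za-z]+$")

def invalid_header_names(request):
    """Return the header names on a urllib Request that are not valid tokens.

    Hypothetical helper for illustration only.
    """
    return [name for name, _ in request.header_items() if not TOKEN.match(name)]

url = 'https://www.myvue.com/whats-on'
bad = urllib.request.Request(url, headers={'User Agent': 'Mozilla/5.0'})
good = urllib.request.Request(url, headers={'User-Agent': 'Mozilla/5.0'})

# urllib stores header names via str.capitalize(), so they come back
# as 'User agent' and 'User-agent' respectively.
print(invalid_header_names(bad))
print(invalid_header_names(good))
```

Nothing is sent over the network here; the check only looks at the headers stored on the Request object.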
Additionally, I would recommend switching over to the requests module.
import requests
html_page = requests.get(url, headers={'User-Agent': 'Mozilla/5.0'}).text
This does the work of three lines of urllib code and is much more readable. In addition, it automatically decodes the content for you.
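One thing to watch with requests: the second positional argument of requests.get is params (the query string), so the header dict has to be passed with the headers keyword. A small sketch, assuming requests is installed, that prepares both variants without sending anything so you can see where the dict ends up. The names wrong and right are just for illustration.

```python
import requests

url = 'https://www.myvue.com/whats-on'
ua = {'User-Agent': 'Mozilla/5.0'}

# Same effect as requests.get(url, ua): the dict is treated as query parameters.
wrong = requests.Request('GET', url, params=ua).prepare()

# Same effect as requests.get(url, headers=ua): the dict becomes request headers.
right = requests.Request('GET', url, headers=ua).prepare()

print(wrong.url)            # the User-Agent pair is appended to the URL
print(dict(wrong.headers))  # no User-Agent header set here
print(right.url)
print(dict(right.headers))
```

When you do send the request for real, calling raise_for_status() on the response will raise an exception on a 4xx or 5xx status instead of silently handing you an error page.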
Source: https://stackoverflow.com/questions/45058583/how-do-i-fix-a-http-error-400-bad-request