Question:
I'm trying to log into a website that uses Google credentials. This fails in my scrapy spider:
def parse(self, response):
    return scrapy.FormRequest.from_response(
        response,
        formdata={'email': self.var.user, 'password': self.var.password},
        callback=self.after_login)
Any tips?
Answer 1:
After further inspection I managed to solve this; it turned out to be a simple issue:
- The form fields are named Email and Passwd, in that order.
- The login has to be broken into two requests: the first submits the email, the second submits the password.
The working code is as follows:
def parse(self, response):
    """Insert the email. Next, go to the password page."""
    return scrapy.FormRequest.from_response(
        response,
        formdata={'Email': self.var.user},
        callback=self.log_password)

def log_password(self, response):
    """Enter the password to complete the log in."""
    return scrapy.FormRequest.from_response(
        response,
        formdata={'Passwd': self.var.password},
        callback=self.after_login)
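The answer stops at after_login. A minimal sketch of what that callback could check, assuming (this marker is an assumption, not from the answer) that a failed login re-renders the page containing the Passwd field:

def login_failed(body: bytes) -> bool:
    """Heuristic: if the response still contains the Passwd input,
    the credentials were rejected and we are back on the login form.
    The b'Passwd' marker is an assumption for illustration."""
    return b'Passwd' in body

# Inside the spider, after_login might use it like this (sketch):
# def after_login(self, response):
#     if login_failed(response.body):
#         self.logger.error('Login failed')
#         return
#     # Authenticated: continue with the requests that needed the session.

A check like this matters because Scrapy will happily call after_login with the error page; without it, the spider silently scrapes the login form instead of the protected content.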
Source: https://stackoverflow.com/questions/34381055/fetch-pages-with-scrapy-behind-google-authentication