Log in not working using scrapy

Submitted by 陌路散爱 on 2019-12-12 10:26:59

Question


I have written Scrapy code to log in to a site. I first tried it on one site, and it worked well. Then I changed the URL and tried it on another site, where it does not work. I used the same code without any change. What could be the problem?

# -*- coding: utf-8 -*-
import scrapy
from scrapy.http import FormRequest
from scrapy.utils.response import open_in_browser

class QuoteSpider(scrapy.Spider):
    name = 'Quote'
    allowed_domains = ["quotes.toscrape.com"]
    start_urls = (
        'http://quotes.toscrape.com/login',
    )

    def parse(self, response):
        token=response.xpath('//input[@name="csrf_token"]/@value').extract_first()

        return FormRequest.from_response(response,formdata={'csrf_token':token,'password':'foo','username':'foo'},callback=self.scape_home_page)

    def scape_home_page(self, response):
        open_in_browser(response)

This worked well.

# -*- coding: utf-8 -*-
import scrapy
from scrapy.http import FormRequest
from scrapy.utils.response import open_in_browser

class BucketsSpider(scrapy.Spider):
    name = 'buckets'
    allowed_domains = ['http://collegekart.in/login']
    start_urls = ['http://collegekart.in/login/']

    def parse(self, response):
        token=response.xpath('//meta[@name="csrf-token"]/@content').extract_first()
        print(token)
        return FormRequest.from_response(response,formdata={'csrf-token':token,'password':'*********','username':'**************'},callback=self.scape_home_page)

    def scape_home_page(self, response):
        open_in_browser(response)
        print("yes")

This one does not work. Please help me solve this.


Answer 1:


What's wrong

  • FormRequest.from_response(response, ...)
    • from_response resolves the form's action relative to response.url, which here is http://collegekart.in/login/ instead of http://collegekart.in/
  • allowed_domains = ['http://collegekart.in/login']
    • allowed_domains expects bare domain names, not URLs, so the login GET request to collegekart.in/ is not covered by this entry
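The second point can be illustrated with a small sketch of hostname matching. This is a simplified approximation written for this answer, not Scrapy's actual implementation; the helper name `is_allowed` is hypothetical, and the login URL is the one reported further down.

```python
import re
from urllib.parse import urlparse


def is_allowed(url, allowed_domains):
    """Return True if the host of `url` is an allowed domain or a subdomain of one."""
    host = urlparse(url).hostname or ""
    # Build one pattern that accepts each allowed domain and its subdomains.
    pattern = r"^(.*\.)?(%s)$" % "|".join(re.escape(d) for d in allowed_domains)
    return re.match(pattern, host) is not None


login_url = "http://collegekart.in/access/attempt_login"

# A full URL can never equal a bare hostname, so this entry matches nothing.
print(is_allowed(login_url, ["http://collegekart.in/login"]))  # False

# A bare domain matches the request's host.
print(is_allowed(login_url, ["collegekart.in"]))  # True
```

The comparison is made against the hostname only, which is why the entry should be a bare domain such as collegekart.in.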

How to fix it

# -*- coding: utf-8 -*-
import scrapy
from scrapy.http import FormRequest
from scrapy.utils.response import open_in_browser

class BucketsSpider(scrapy.Spider):
    name = 'buckets'
    allowed_domains = ['collegekart.in']
    start_urls = ['http://collegekart.in/login/']

    def parse(self, response):
        token=response.xpath('//meta[@name="csrf-token"]/@content').extract_first()
        print(token)
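        # from_response resolves the form action against response.url,
        # so point the response at the site root before building the request
        response = response.replace(url='http://collegekart.in/')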
        return FormRequest.from_response(response,formdata={'csrf-token':token, 'password':'hanfenghanfeng','username':'zerqqr1@iydhp.com'},callback=self.scape_home_page)

    def scape_home_page(self, response):
        open_in_browser(response)
        print("yes")

Why?

  • If you don't replace the url in the response:

    • Scrapy builds the login request relative to http://collegekart.in/login/ and sends it to an incorrect URL: http://collegekart.in/login/access/attempt_login?utf8=%E2%9C%93&username=zerqqr1%40iydhp.com&password=hanfenghanfeng

    • This is the correct URL: http://collegekart.in/access/attempt_login?utf8=%E2%9C%93&username=zerqqr1%40iydhp.com&password=hanfenghanfeng

  • The login GET URL is not covered by allowed_domains

    • allowed_domains = ['http://collegekart.in/login']
    • Login GET URL: http://collegekart.in/access/.......
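The difference between the two URLs comes down to how a relative path is joined to a base URL. Here is a minimal sketch using only the standard library, assuming (from the URLs above) that the form's action is the relative path access/attempt_login; the variable names are illustrative.

```python
from urllib.parse import urljoin

# Assumption for illustration: the login form's action is a relative path.
action = "access/attempt_login"

# Joined against the login page URL, the path lands under /login/.
wrong = urljoin("http://collegekart.in/login/", action)

# Joined against the site root, it lands where the server expects it.
right = urljoin("http://collegekart.in/", action)

print(wrong)  # http://collegekart.in/login/access/attempt_login
print(right)  # http://collegekart.in/access/attempt_login
```

Replacing the response URL with the site root changes the base used for this join, which is why the request then reaches the right endpoint.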

Suggestions

  • Use the Network tab in Chrome's Inspector to see the actual request being made when you log in

  • See the official Scrapy tutorial (PDF version)




Answer 2:


Change the response URL accordingly; this will solve the problem.



Source: https://stackoverflow.com/questions/47259090/log-in-not-working-using-scrapy
