Python Scrapy: can't POST information to forms

一向 2021-02-04 16:22

I'd like to ask a very big favor, as I have been struggling with this problem for several days. I have tried every approach I know of and still have no result. I am doing somethin

1 answer
  •  闹比i (original poster)
    2021-02-04 17:15

    Here's a working example of using FormRequest.from_response for delta.com:

    from scrapy.item import Item, Field
    from scrapy.http import FormRequest
    from scrapy.spiders import Spider  # BaseSpider was removed from current Scrapy
    
    
    class DeltaItem(Item):
        title = Field()
        link = Field()
        desc = Field()
    
    
    class DmozSpider(Spider):
        name = "delta"
        # Bare domain names only, not full URLs.
        allowed_domains = ["delta.com"]
        start_urls = ["http://www.delta.com"]
    
        def parse(self, response):
            # Fill in the search form found on the start page and submit it.
            yield FormRequest.from_response(
                response,
                formname='flightSearchForm',
                formdata={'departureCity[0]': 'JFK',
                          'destinationCity[0]': 'SFO',
                          'departureDate[0]': '07.20.2013',
                          'departureDate[1]': '07.28.2013'},
                callback=self.parse1)
    
        def parse1(self, response):
            # Print the HTTP status of the form submission response.
            print(response.status)
    

    Your spider used the wrong methods, and allowed_domains was set incorrectly.
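
The question's original allowed_domains value is not shown here, so the "wrong" value below is only a guess at a common mistake: putting a full URL where a bare domain is expected. The helper is_allowed is a hypothetical, simplified approximation of how Scrapy's offsite filtering compares a request's host name against allowed_domains; it is not Scrapy's own code.

```python
from urllib.parse import urlparse


def is_allowed(url, allowed_domains):
    """Return True if the URL's host is an allowed domain or a subdomain of one.

    Simplified stand-in for Scrapy's offsite check (hypothetical helper).
    """
    host = (urlparse(url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in allowed_domains)


# A bare domain matches the site and its subdomains.
print(is_allowed("http://www.delta.com/booking", ["delta.com"]))

# A full URL never equals a host name, so nothing matches it.
print(is_allowed("http://www.delta.com/booking", ["http://www.delta.com"]))
```

With an entry that never matches, follow-up requests such as the form submission can be filtered out as offsite before they are ever sent.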

    That said, delta.com relies heavily on dynamic AJAX calls to load its content, and this is where your problems start. For example, the response in the parse1 method doesn't contain any search results; instead it contains the HTML of an interstitial "AWAY WE GO. ARRIVING AT YOUR FLIGHTS SOON" page, and the results are loaded dynamically afterwards.

    Basically, you should use your browser's developer tools to find those AJAX calls and try to reproduce them inside your spider, or use a tool like Selenium, which drives a real browser (and can be combined with Scrapy).

    See also:

    • Scraping ajax pages using python
    • Can scrapy be used to scrape dynamic content from websites that are using AJAX?
    • Pagination using scrapy

    Hope that helps.
