I would like to ask a big favor, as I have been struggling with this problem for several days. I have tried every approach I know of and still have no result. I am doing something wrong.
Here's a working example of using FormRequest.from_response for delta.com:
from scrapy.item import Item, Field
from scrapy.http import FormRequest
from scrapy.spider import BaseSpider


class DeltaItem(Item):
    title = Field()
    link = Field()
    desc = Field()


class DmozSpider(BaseSpider):
    name = "delta"
    allowed_domains = ["delta.com"]
    start_urls = ["http://www.delta.com"]

    def parse(self, response):
        yield FormRequest.from_response(
            response,
            formname='flightSearchForm',
            formdata={'departureCity[0]': 'JFK',
                      'destinationCity[0]': 'SFO',
                      'departureDate[0]': '07.20.2013',
                      'departureDate[1]': '07.28.2013'},
            callback=self.parse1)

    def parse1(self, response):
        print response.status
You had used the wrong spider methods, and allowed_domains was set incorrectly.
But anyway, delta.com relies heavily on dynamic AJAX calls to load its content, and this is where your problems start. For example, the response in the parse1 method doesn't contain any search results; instead it contains the HTML for the "AWAY WE GO. ARRIVING AT YOUR FLIGHTS SOON" page, where the results are loaded dynamically.
Basically, you should work with your browser's developer tools, watch which AJAX calls the page makes, and try to simulate them inside your spider, or use a tool like Selenium, which drives a real browser (and you can combine it with Scrapy).
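If you go the Selenium route, a rough sketch of combining it with a Scrapy spider looks like the following. This is an outline under assumptions, not a drop-in solution: it needs a local browser/driver installed, and the spider name and the steps marked with comments are illustrative, not Delta's actual page structure.

```python
from selenium import webdriver
from scrapy.spider import BaseSpider


class DeltaSeleniumSpider(BaseSpider):
    name = "delta-selenium"  # hypothetical spider name
    start_urls = ["http://www.delta.com"]

    def __init__(self, *args, **kwargs):
        super(DeltaSeleniumSpider, self).__init__(*args, **kwargs)
        # A real browser that will execute the page's AJAX for you
        self.driver = webdriver.Firefox()

    def parse(self, response):
        # Load the same URL in the browser so the dynamic content renders
        self.driver.get(response.url)
        # ... fill the search form and wait for the results to appear ...
        html = self.driver.page_source  # fully rendered HTML
        # hand `html` to your normal Scrapy selectors / item extraction
        self.driver.quit()
```

The trade-off is speed: Selenium is much slower than plain Scrapy requests, so it is usually worth first trying to replicate the AJAX calls you see in the developer tools directly.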
Hope that helps.