Scraping a website using Scrapy and selenium

攒了一身酷 2021-01-27 09:06

I want to scrape the HTML content of http://ntry.com/#/scores/named_ladder/main.php with Scrapy.

But, because of the site's JavaScript use

1 Answer
  • 2021-01-27 09:45

    I installed Selenium, loaded its PhantomJS driver, and it worked perfectly.

    Here is what you can try:

    from scrapy import Spider
    from selenium import webdriver
    from selenium.webdriver.common.desired_capabilities import DesiredCapabilities


    class FormSpider(Spider):
        name = "form"

        def __init__(self, *args, **kwargs):
            # Let Scrapy finish its own spider setup first
            super().__init__(*args, **kwargs)

            # Copy the default PhantomJS capabilities and override the user agent
            dcap = dict(DesiredCapabilities.PHANTOMJS)
            dcap["phantomjs.page.settings.userAgent"] = ("Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/38.0.2125.122 Safari/537.36")

            # Note: webdriver.PhantomJS only exists in Selenium 3.x and earlier
            self.driver = webdriver.PhantomJS(desired_capabilities=dcap, service_args=['--ignore-ssl-errors=true', '--ssl-protocol=any', '--web-security=false'])
            self.driver.set_window_size(1366, 768)

        def parse_page(self, response):
            # Load the page in PhantomJS so that its JavaScript runs
            self.driver.get(response.url)
            cookies_list = self.driver.get_cookies()
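
The snippet above stops right after collecting `cookies_list`. One possible next step, sketched here as an assumption rather than as part of the original answer, is to flatten those cookies into a plain name-to-value dict, which is one of the forms the `cookies` argument of a Scrapy `Request` accepts. The helper name `cookies_to_dict` and the sample records are hypothetical; the `name` and `value` keys follow the shape of the cookie dicts that Selenium's `get_cookies()` returns.

```python
def cookies_to_dict(cookies_list):
    """Collapse Selenium-style cookie records into a {name: value} mapping.

    Each record is expected to be a dict with at least "name" and "value"
    keys; extra keys such as "domain" or "path" are ignored.
    """
    return {cookie["name"]: cookie["value"] for cookie in cookies_list}


# Hypothetical sample standing in for the result of driver.get_cookies()
sample_cookies = [
    {"name": "sessionid", "value": "abc123", "domain": "example.com", "path": "/"},
    {"name": "lang", "value": "en", "domain": "example.com", "path": "/"},
]

cookie_dict = cookies_to_dict(sample_cookies)
print(cookie_dict)
```

Inside the spider, the resulting dict could then be handed to a follow-up request, for example `Request(url, cookies=cookie_dict, callback=...)`, so that Scrapy reuses the session the browser established. Whether that is enough depends on how the target site ties its content to the session.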
    