scrapinghub

unable to scrape myntra API data using scrapy framework 307 redirect error

被刻印的时光 ゝ submitted on 2019-12-13 10:07:15
Question: Below is the spider code:

    import scrapy

    class MyntraSpider(scrapy.Spider):
        custom_settings = {
            'HTTPCACHE_ENABLED': False,
            'dont_redirect': True,
            #'handle_httpstatus_list' : [302,307],
            #'CRAWLERA_ENABLED': False,
            'USER_AGENT': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.84 Safari/537.36',
        }
        name = "heytest"
        allowed_domains = ["www.myntra.com"]
        start_urls = ["https://www.myntra.com/web/v2/search/data/duke"]

        def parse(self, response):
            self

How to use peewee with scrapinghub

醉酒当歌 submitted on 2019-12-12 04:36:51
Question: I want to save my data to a remote machine using peewee. When I run my crawler, I get the following error:

    File "/usr/local/lib/python2.7/site-packages/scrapy/commands/crawl.py", line 57, in run
      self.crawler_process.crawl(spname, **opts.spargs)
    File "/usr/local/lib/python2.7/site-packages/scrapy/crawler.py", line 163, in crawl
      return self._crawl(crawler, *args, **kwargs)
    File "/usr/local/lib/python2.7/site-packages/scrapy/crawler.py", line 167, in _crawl
      d = crawler.crawl(*args, **kwargs)
    File

Dependency error while trying to run project on Scrapy Cloud

不羁的心 submitted on 2019-12-11 15:19:49
Question: I created a project with Scrapy and use pymongo to save my data to MongoDB. I have checked that my pymongo version is 3.5.1. When I deploy my project to Scrapinghub and run it, it shows this error: exceptions.ImportError: No module named pymongo. I have created requirements.txt and scrapinghub.yml. Why does it show this error? Any help would be appreciated. Thanks in advance.

Answer 1: The requirements.txt format you are using did not work for me either.

Set variable on shub deploy project

社会主义新天地 submitted on 2019-12-06 12:33:21
Question: I'm trying to set up Scrapy settings to work with test and production environments, both locally and on Scrapinghub. I would like to know if there is any way to set this variable (for example as the following) on shub deploy: And then at settings.py:

    if env == "test":
        var1 = some_ip
        var2 = username
    elif env == "prod":
        var1 = some_ip
        var2 = username

Or... maybe there is a cleaner way to do this? Thank you for reading! PS: I want to automate the settings depending on the environment where is the
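
One way to sketch the settings.py branching above is to read the environment name from an OS environment variable and look the values up in a table. This is only an illustration under stated assumptions: the variable name SCRAPY_ENV, the helper load_env_settings, and all IP and username values are hypothetical and do not come from the question. How the variable gets set on each platform is left open.

```python
import os

# Hypothetical per-environment values; the addresses and usernames are placeholders.
ENVIRONMENTS = {
    "test": {"var1": "192.0.2.10", "var2": "test_user"},
    "prod": {"var1": "192.0.2.20", "var2": "prod_user"},
}


def load_env_settings(environ=os.environ):
    """Return the values for the environment named in SCRAPY_ENV.

    SCRAPY_ENV is an assumed variable name, not a Scrapy or shub built-in.
    Falls back to "test" when the variable is not set.
    """
    env = environ.get("SCRAPY_ENV", "test")
    if env not in ENVIRONMENTS:
        raise ValueError(f"Unknown environment: {env!r}")
    return ENVIRONMENTS[env]


# Module-level names, mirroring var1 and var2 from the question.
_selected = load_env_settings()
var1 = _selected["var1"]
var2 = _selected["var2"]
```

Keeping the values in a lookup table avoids repeating the same assignments in every if/elif branch, and an unknown environment name fails loudly instead of silently leaving var1 and var2 undefined.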