scrapyd

Running multiple spiders using scrapyd

会有一股神秘感。 Posted on 2020-01-02 07:24:07
Question: I have multiple spiders in my project, so I decided to run them by uploading the project to a Scrapyd server. The upload succeeded, and I can see all the spiders when I run the command curl http://localhost:6800/listspiders.json?project=myproject. When I run the following command curl http://localhost:6800/schedule.json -d project=myproject -d spider=spider2 only one spider runs, because only one spider is given, but I want to run multiple spiders here, so is the following command right for …
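Scrapyd's schedule.json endpoint starts exactly one spider per request, so the usual workaround is simply to call it once for each spider in the project. A minimal sketch, assuming Scrapyd on localhost:6800, a project named myproject, and the requests library:

    # Schedule every spider in a project through Scrapyd's HTTP API.
    import requests

    SCRAPYD = "http://localhost:6800"
    PROJECT = "myproject"  # placeholder: use your own project name

    # listspiders.json returns {"status": "ok", "spiders": ["spider1", "spider2", ...]}
    spiders = requests.get(SCRAPYD + "/listspiders.json",
                           params={"project": PROJECT}).json()["spiders"]

    for name in spiders:
        # schedule.json accepts a single spider per call, hence the loop
        job = requests.post(SCRAPYD + "/schedule.json",
                            data={"project": PROJECT, "spider": name}).json()
        print(name, job.get("jobid"))

Each call returns a jobid, and the spiders then run under Scrapyd's own scheduler, subject to its max_proc / max_proc_per_cpu settings.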

Why does scrapyd throw: “'FeedExporter' object has no attribute 'slot'” exception?

雨燕双飞 Posted on 2020-01-02 06:24:18
Question: I came across a situation where my Scrapy code works fine from the command line, but when I run the same spider after deploying it (scrapy-deploy) and scheduling it with the scrapyd API, it throws errors in the "scrapy.extensions.feedexport.FeedExporter" class: one while handling the "open_spider" signal, a second while handling the "item_scraped" signal, and the last while handling the "close_spider" signal. 1. "open_spider" signal error: 2016-05-14 12:09:38 [scrapy] INFO: Spider opened 2016-05-14 12:09:38 …
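A frequent cause of this particular traceback (an assumption here, since the excerpt is cut off) is that FeedExporter fails inside its open_spider handler while opening the feed storage, for example because a relative FEED_URI resolves differently under Scrapyd than it does on the command line; self.slot is then never assigned, and the later item_scraped and close_spider handlers raise the "'FeedExporter' object has no attribute 'slot'" error. Pointing the feed at an absolute, writable location is one way to rule that out; a sketch for settings.py, with a hypothetical path:

    # settings.py -- absolute feed location so the export also works under Scrapyd
    FEED_FORMAT = "json"
    FEED_URI = "file:///var/lib/scrapyd/exports/%(name)s-%(time)s.json"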

error in deploying a project using scrapyd

╄→尐↘猪︶ㄣ Posted on 2020-01-02 02:31:46
Question: I have multiple spiders in my project folder and want to run all of them at once, so I decided to run them using the Scrapyd service. I started doing this by following the guide here. First of all, I am in the current project folder. I opened the scrapy.cfg file and uncommented the url line under [deploy]. I ran the scrapy server command, which works fine, and the scrapyd server runs. I tried the command scrapy deploy -l. Result: default http://localhost:6800/. When I tried the command scrapy deploy -L scrapyd …
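For reference, the [deploy] section that scrapy deploy -l reads lives in scrapy.cfg at the project root; a minimal sketch, with placeholder project name and URL:

    [settings]
    default = myproject.settings

    [deploy]
    url = http://localhost:6800/
    project = myproject

With this in place, scrapy deploy -l lists the target as default http://localhost:6800/, which matches the result above. The -L option, by contrast, expects the name of a target defined in scrapy.cfg (here default), which would explain a failure from scrapy deploy -L scrapyd if no target is actually named scrapyd.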

scrapyd-client command not found

风流意气都作罢 Posted on 2020-01-01 05:31:25
Question: I just installed scrapyd-client (1.1.0) in a virtualenv and ran the command 'scrapyd-deploy' successfully, but when I run 'scrapyd-client', the terminal says: command not found: scrapyd-client. According to the readme file (https://github.com/scrapy/scrapyd-client), there should be a 'scrapyd-client' command. I checked the path '/lib/python2.7/site-packages/scrapyd-client'; only 'scrapyd-deploy' is in the folder. Has the command 'scrapyd-client' been removed for now? Answer 1: Create a fresh …
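In scrapyd-client 1.1.0 the only console script installed is scrapyd-deploy; the separate scrapyd-client command described in the README was only added in later versions of the package, so its absence in 1.1.0 is expected. Deployment therefore goes through scrapyd-deploy directly, for example (target and project names are placeholders):

    scrapyd-deploy -l                    # list targets defined in scrapy.cfg
    scrapyd-deploy default -p myproject  # build an egg and upload it to that target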

Proper way to run multiple scrapy spiders

为君一笑 Posted on 2019-12-23 16:56:46
Question: I just tried running multiple spiders in the same process using the new Scrapy documentation, but I am getting: AttributeError: 'CrawlerProcess' object has no attribute 'crawl'. I found this SO post with the same problem, so I tried using the code from the 0.24 documentation and got: runspider: error: Unable to load 'price_comparator.py': No module named testspiders.spiders.followall. For 1.0 I imported: from scrapy.crawler import CrawlerProcess, and for 0.24 I imported: from twisted.internet …
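For completeness, CrawlerProcess.crawl() does exist in Scrapy 1.0 and later; the AttributeError above is what the 1.0 snippet produces when run against Scrapy 0.24. A minimal sketch of the 1.0+ pattern, with placeholder spider classes standing in for the project's own spiders:

    # run_spiders.py -- run two spiders in one process (Scrapy 1.0+ API)
    from scrapy import Spider
    from scrapy.crawler import CrawlerProcess

    class Spider1(Spider):
        name = "spider1"
        start_urls = ["http://example.com"]

        def parse(self, response):
            yield {"url": response.url}

    class Spider2(Spider):
        name = "spider2"
        start_urls = ["http://example.org"]

        def parse(self, response):
            yield {"url": response.url}

    process = CrawlerProcess()
    process.crawl(Spider1)   # crawl() is available on CrawlerProcess from 1.0 onward
    process.crawl(Spider2)
    process.start()          # blocks until both crawls have finished

Run it with python run_spiders.py; crawl() can be called any number of times before start(), and all scheduled spiders share the same Twisted reactor.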

Error when deploying scrapy project on the scrapy cloud

喜你入骨 Posted on 2019-12-23 06:44:00
Question: I am using Scrapy 0.20 on Python 2.7, and I want to deploy my Scrapy project on Scrapy Cloud. I developed my Scrapy project with a simple spider, navigated to my Scrapy project folder, and typed scrapy deploy scrapyd -d koooraspider on cmd, where koooraspider is my project's name and scrapyd is my target. I got the following error: D:\Walid-Project\Tasks\koooraspider>scrapy deploy scrapyd -p koooraspider Packing version 1395847344 Deploying to project "koooraspider" in http://dash.scrapinghub.com/api …

Preferred way to run Scrapyd in the background / as a service

允我心安 Posted on 2019-12-22 09:47:02
Question: I am trying to run Scrapyd on a virtual Ubuntu 16.04 server, to which I connect via SSH. When I start Scrapyd by simply running $ scrapyd, I can reach the web interface by going to http://82.165.102.18:6800. However, once I close the SSH connection, the web interface is no longer available, so I think I need to run Scrapyd in the background, as a service of some kind. After some research I came across a few proposed solutions: daemon (sudo apt install daemon), screen (sudo apt install …
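Besides daemon and screen, the most conventional choice on Ubuntu 16.04 is a systemd unit, since systemd is already the init system there. A minimal sketch; the user name, paths, and unit file location are assumptions, so adjust ExecStart to the output of which scrapyd:

    # /etc/systemd/system/scrapyd.service
    [Unit]
    Description=Scrapyd
    After=network.target

    [Service]
    # the user and directories below are placeholders for your own setup
    User=scrapyd
    WorkingDirectory=/var/lib/scrapyd
    ExecStart=/usr/local/bin/scrapyd
    Restart=on-failure

    [Install]
    WantedBy=multi-user.target

After creating the file, sudo systemctl daemon-reload followed by sudo systemctl enable --now scrapyd starts the service and keeps it running across SSH disconnects and reboots.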

How to password protect Scrapyd UI?

和自甴很熟 Posted on 2019-12-22 00:49:47
Question: I have my website available to the public, and Scrapyd is running on port 6800, like http://website.com:6800/. I do not want anyone to see the list of my crawlers. I know anyone can easily guess and type in port 6800 and see what's going on. I have a few questions; answering any of them will help me. Is there a way to password protect the Scrapyd UI? Can I password protect a specific port on Linux? I know it can be done with iptables to ONLY ALLOW PARTICULAR IPs, but that's not a good solution. Should I make …
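Scrapyd versions from around this period ship no authentication of their own, so the usual pattern is to bind Scrapyd to localhost (bind_address = 127.0.0.1 in scrapyd.conf) and put a reverse proxy with HTTP basic auth in front of it. A sketch with nginx; the host name and htpasswd path are placeholders:

    server {
        listen 80;
        server_name website.com;

        location / {
            auth_basic           "Scrapyd";
            auth_basic_user_file /etc/nginx/.htpasswd;  # create with: htpasswd -c /etc/nginx/.htpasswd someuser
            proxy_pass           http://127.0.0.1:6800;
        }
    }

This addresses the first two questions at once: the UI sits behind a password, and port 6800 itself is no longer reachable from outside the machine.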

windows scrapyd-deploy is not recognized

别等时光非礼了梦想. Posted on 2019-12-18 12:42:11
Question: I have installed scrapyd like this: pip install scrapyd. I want to use scrapyd-deploy, but when I type scrapyd I get this error in cmd: 'scrapyd' is not recognized as an internal or external command, operable program or batch file. Answer 1: I ran into the same issue, and I also read some opinions that Scrapyd isn't available / can't run on Windows and nearly gave up (I didn't really need it, since I intend to deploy to a Linux machine; I wanted Scrapyd on Windows for debugging purposes). However, after …
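A commonly reported fix, stated here as an assumption since the answer above is cut off, is that on Windows pip installs scrapyd-deploy as a plain Python script with no .exe wrapper, so cmd cannot run it by name (and note that scrapyd-deploy is provided by the scrapyd-client package, not by scrapyd itself). A small batch wrapper placed next to the script makes the command usable; the Python paths below are placeholders for your own installation:

    @echo off
    rem C:\Python27\Scripts\scrapyd-deploy.bat -- hypothetical wrapper script
    "C:\Python27\python.exe" "C:\Python27\Scripts\scrapyd-deploy" %*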