I developed few spiders in scrapy & I want to test those on Heroku cloud. Does anybody have any idea about how to deploy a Scrapy spider on Heroku cloud?
Yes, it's fairly simple to deploy and run your Scrapy spider on Heroku.
Here are the steps using a real Scrapy project as example:
Clone the project (note that it must have a requirements.txt
file for Heroku to recognize it as a Python project):
git clone https://github.com/scrapinghub/testspiders.git
Add cffi to the requirement.txt file (e.g. cffi==1.1.0).
Create the Heroku application (this will add a new heroku git remote):
heroku create
Deploy the project (this will take a while the first time, when the slug is built):
git push heroku master
Run your spider:
heroku run scrapy crawl followall
Some notes:
-o s3://mybucket/items.jl
) or use an addon (like MongoHQ or Redis To Go) and write a pipeline to store your items theresqlite3
module (which Scrapyd requires) doesn't work on Heroku