Question
I want to create my own service for the scrapyd API that returns a little more information about a running crawler. I got stuck at the very beginning: where should I place the module that will contain that service? The default "scrapyd.conf" has a section called services:
[services]
schedule.json = scrapyd.webservice.Schedule
cancel.json = scrapyd.webservice.Cancel
addversion.json = scrapyd.webservice.AddVersion
listprojects.json = scrapyd.webservice.ListProjects
listversions.json = scrapyd.webservice.ListVersions
listspiders.json = scrapyd.webservice.ListSpiders
delproject.json = scrapyd.webservice.DeleteProject
delversion.json = scrapyd.webservice.DeleteVersion
listjobs.json = scrapyd.webservice.ListJobs
So these are the absolute import paths to each service in the scrapyd package, which is installed in the dist-packages folder. Is there any way to keep my own module, containing the service, somewhere other than the dist-packages folder?
Update: I realized the question may be unclear. Scrapy is a framework for extracting data from websites. I have a simple Django site from which I can start and stop crawlers for a specific region, etc. (http://54.186.79.236, it's in Russian). The crawlers are controlled through the scrapyd API. By default it only has a small set of endpoints for starting, stopping, and listing crawlers, their logs, etc. These endpoints are listed in the docs: http://scrapyd.readthedocs.org/en/latest/api.html

That was a little intro; now to the question. I want to extend the existing API to retrieve more info from a running crawler and render it on the website mentioned above. For this I need to inherit from the existing scrapyd.webservice.WsResource and write a service. That part works fine if I place the service module in one of the 'sys.path' paths. But I want to keep the module containing this service in the Scrapy project folder (for aesthetic reasons). If I keep it there, scrapyd complains (predictably) with 'No module named' on launch.
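The 'No module named' behaviour can be reproduced without scrapyd at all. The sketch below assumes that each dotted path in the services section is resolved with an ordinary Python import, so the module only has to be on 'sys.path', not in dist-packages. The names `load_object` (a simplified stand-in written here), `my_services` and `RunningInfo` are hypothetical and used for illustration only; the temporary directory plays the role of the Scrapy project folder.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def load_object(dotted_path):
    """Resolve 'package.module.ClassName' to the object it names."""
    module_name, _, attr = dotted_path.rpartition(".")
    module = importlib.import_module(module_name)
    return getattr(module, attr)


# Hypothetical project folder holding a hypothetical service module.
project_dir = Path(tempfile.mkdtemp())
(project_dir / "my_services.py").write_text(
    "class RunningInfo:\n"
    "    '''Stand-in for a WsResource subclass.'''\n"
)

# The folder is not on sys.path yet, so the import fails.
try:
    load_object("my_services.RunningInfo")
    found_before = True
except ImportError:  # 'No module named my_services'
    found_before = False

# Make the folder importable. Exporting PYTHONPATH before launching
# the process has the same effect on sys.path.
sys.path.insert(0, str(project_dir))
importlib.invalidate_caches()
service_cls = load_object("my_services.RunningInfo")

print(found_before, service_cls.__name__)
```

Under that assumption, launching scrapyd with the project folder on PYTHONPATH should let a services entry such as `runninginfo.json = my_services.RunningInfo` (again, hypothetical names) resolve while the module stays in the project folder.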
Answer 1:
So, I solved my problem according to this.
Source: https://stackoverflow.com/questions/28950416/implementing-own-scrapyd-service