Using arguments in scrapy pipeline on __init__

滥情空心 2020-12-31 21:25

I have a Scrapy pipelines.py and I want to access the arguments passed to the spider. In my spider.py it works perfectly:

class MySpider(CrawlSpider):
    def __init__(self, user_id='', *args, **kwargs):
        self.user_id = user_id
        super(MySpider, self).__init__(*args, **kwargs)
2 Answers
  • 2020-12-31 22:13

    I may be too late to provide a useful answer to the OP, but for anybody reaching this question in the future (as I did): you should check the classmethods from_crawler and/or from_settings.

    This way you can pass your arguments the way you want.

    Check: https://doc.scrapy.org/en/latest/topics/item-pipeline.html#from_crawler

    from_crawler(cls, crawler)

    If present, this classmethod is called to create a pipeline instance from a Crawler. It must return a new instance of the pipeline. The Crawler object provides access to all Scrapy core components like settings and signals; it is a way for the pipeline to access them and hook its functionality into Scrapy.

    Parameters: crawler (Crawler object) – crawler that uses this pipeline
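    A minimal sketch of a pipeline built this way: from_crawler reads a value from the crawler's settings and passes it to the constructor. The setting name USER_ID and the class name are assumptions for illustration, not part of Scrapy itself.

    ```python
    class UserPipeline:
        """Pipeline that receives its argument via from_crawler (sketch)."""

        def __init__(self, user_id):
            self.user_id = user_id

        @classmethod
        def from_crawler(cls, crawler):
            # crawler.settings exposes settings.py values and -s overrides;
            # USER_ID is a hypothetical setting used here for illustration
            return cls(user_id=crawler.settings.get('USER_ID', 'default'))
    ```

    You could then set USER_ID in settings.py or override it on the command line with `-s USER_ID=123`.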

  • 2020-12-31 22:19

    Set the arguments inside the spider's constructor:

    class MySpider(CrawlSpider):
        def __init__(self, user_id='', *args, **kwargs):
            self.user_id = user_id
    
            super(MySpider, self).__init__(*args, **kwargs) 
    

    And read them in the open_spider() method of your pipeline:

    def open_spider(self, spider):
        print(spider.user_id)
    
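    Spider arguments are supplied on the command line with `-a`, e.g. `scrapy crawl myspider -a user_id=123`. A small sketch of the pipeline side (the class name MyPipeline is an assumption for illustration); using getattr with a default keeps the pipeline working even when the argument was not supplied:

    ```python
    class MyPipeline:
        def open_spider(self, spider):
            # attributes set in the spider's __init__ are visible here;
            # fall back to '' if the argument was not passed with -a
            self.user_id = getattr(spider, 'user_id', '')
    ```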