How to pass arguments (for FEED_URI) to a Scrapy spider instance to dynamically name the output file

执念已碎 2021-01-15 06:16

I want to pass arguments to a spider and have the output file (JSON, CSV) named according to those arguments.
For example:
$ scrapy crawl spider_name -a category=category1 -a subcategory=

1 answer
  • 2021-01-15 06:35

    You can read those arguments from the kwargs of __init__, store them as spider attributes, and reference them in FEED_URI like this:

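    import scrapy


    class MySpider(scrapy.Spider):
        name = 'my_spider'

        # %(category)s and %(subcategory)s are filled in from the spider
        # attributes of the same name when the feed is created.
        custom_settings = {
            'FEED_URI': '%(category)s_%(subcategory)s.json',
        }

        def __init__(self, *args, **kwargs):
            # Values passed with -a on the command line arrive as kwargs.
            self.category = kwargs.pop('category', '')
            self.subcategory = kwargs.pop('subcategory', '')
            super().__init__(*args, **kwargs)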

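    Docs: https://doc.scrapy.org/en/latest/topics/feed-exports.html#storage-uri-parameters

    Note: since Scrapy 2.1, FEED_URI is deprecated in favor of the FEEDS setting; the same %(name)s placeholders can be used in the FEEDS keys.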
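To see why storing the arguments as spider attributes is enough, here is a rough sketch of the substitution step in plain Python, without Scrapy. It assumes the placeholders are filled with printf-style formatting from a mapping of attribute names to values, which is how the storage URI parameters are documented to behave. The helper name build_feed_uri and the values books and fiction are made up for illustration.

```python
def build_feed_uri(template, spider_attrs):
    """Fill %(name)s placeholders in template from a dict of attributes.

    Hypothetical helper: it only imitates the substitution, it is not
    part of Scrapy.
    """
    # printf-style formatting looks up each %(name)s key in the mapping;
    # keys that the template does not mention are simply ignored.
    return template % spider_attrs


# Illustrative values, as if passed with -a category=books -a subcategory=fiction
attrs = {'name': 'my_spider', 'category': 'books', 'subcategory': 'fiction'}
print(build_feed_uri('%(category)s_%(subcategory)s.json', attrs))
```

With these example values the sketch prints books_fiction.json, so a run such as scrapy crawl my_spider -a category=books -a subcategory=fiction would be expected to write a file with that name.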
