Scrapy: Default values for items & fields. What is the best implementation?

臣服心动 2021-01-18 12:06

As far as I can tell from the documentation and various discussions online, the ability to set default values on the fields of a Scrapy item has been removed.

2 Answers
  • 2021-01-18 12:26

    As of version 2.2, Scrapy supports dataclasses as items, so you can define an item with default values like this:

    from dataclasses import dataclass
    
    @dataclass
    class MyPersonalItem:
        name: str = "default name"
        age: int = 100
        address: str = "anywhere in the world"
    

    Then, when you create your item, you can assign the corresponding values with dot notation. Any field you leave unassigned keeps its default:

    victor = MyPersonalItem()
    victor.name = "Victor Herasme"
    victor.age = 42
    victor.address = "Spain"
    

    See the Scrapy docs on the subject: https://docs.scrapy.org/en/latest/topics/items.html#dataclass-objects

    There is also a very good tutorial on data classes: https://realpython.com/python-data-classes/

    As you can see, dataclass items let you use both type hints and default values.
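
    One caveat with dataclass defaults is list-valued fields, which need `field(default_factory=list)` rather than a bare `[]`. The sketch below uses only the standard library; `ProductItem` and its field names are hypothetical and chosen purely for illustration.

```python
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class ProductItem:  # hypothetical item, not from the original answer
    name: str = "default name"
    # A mutable default must be created per instance via default_factory;
    # writing `tags: List[str] = []` raises ValueError when the class is defined.
    tags: List[str] = field(default_factory=list)


first = ProductItem()
second = ProductItem(name="Widget")

# Appending to one item's list does not affect the other item.
first.tags.append("new")

print(asdict(first))
print(second.tags)
```

    Each instance gets its own list, so values appended while scraping one item do not leak into the next one.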

  • 2021-01-18 12:35

    I figured out what the problem was. The pipeline itself works (code follows for other people's reference). My problem was that I am appending values to a field, and I wanted the default to apply to one of the values in that list. I chose a different approach and it works: I am now implementing it with a custom setDefault processor method.

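    class DefaultItemPipeline:
        """Fill in default values for fields the spider left unset."""

        def process_item(self, item, spider):
            # setdefault only assigns a value when the key is missing from the item
            item.setdefault('amz_VendorsShippingDurationFrom', 'default')
            item.setdefault('amz_VendorsShippingDurationTo', 'default')
            # ...
            return item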
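
    The same idea can be driven by a mapping so that the defaults live in one place. This is only a sketch: `DefaultValuesPipeline`, the `defaults` attribute, and the field names are hypothetical, and a plain dict stands in for the item so the example runs without Scrapy installed.

```python
class DefaultValuesPipeline:
    # Hypothetical field names and default values, for illustration only.
    defaults = {
        "shipping_from": "unknown",
        "shipping_to": "unknown",
    }

    def process_item(self, item, spider):
        # Existing keys are left untouched; only missing keys receive a default.
        for key, value in self.defaults.items():
            item.setdefault(key, value)
        return item


# Stand-in for a scraped item; the spider argument is unused in this sketch.
scraped = {"shipping_from": "2 days"}
result = DefaultValuesPipeline().process_item(scraped, spider=None)
print(result)
```

    In a real project the pipeline would also have to be enabled through the `ITEM_PIPELINES` setting before Scrapy calls it.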
    