Suppress Scrapy Item printed in logs after pipeline

I have a scrapy project where the item that ultimately enters my pipeline is relatively large and stores lots of metadata and content. Everything is working properly in my s

8 Answers

暗喜

2020-12-25 13:09

If you want to exclude only some attributes from the logged output, you can extend the answer given by @dino and override __repr__ on the item so that it skips those attributes:

```
import json

from scrapy.item import Item, Field


class MyItem(Item):
    attr1 = Field()
    attr2 = Field()
    attr1ToExclude = Field()
    attr2ToExclude = Field()
    # ...
    attrN = Field()

    def __repr__(self):
        # Copy only the populated fields that should appear in the log.
        r = {}
        # Item behaves like a mapping, so items() works on Python 3
        # (iteritems() existed only on Python 2).
        for attr, value in self.items():
            if attr not in ['attr1ToExclude', 'attr2ToExclude']:
                r[attr] = value
        return json.dumps(r, sort_keys=True, indent=4, separators=(',', ': '))
```
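To see why overriding __repr__ changes the log line, here is a minimal sketch that needs no Scrapy install. It assumes a plain dict subclass (the hypothetical `CompactRecord`) as a stand-in for the item, and a format string that only resembles Scrapy's "Scraped from" message. `%s` formatting calls `str()` on the object, which falls back to `__repr__`, so the excluded field never reaches the message.

```python
import json


class CompactRecord(dict):
    """Hypothetical stand-in for a Scrapy item: a mapping with a custom repr."""

    # Hypothetical names of the large fields to hide from the log output.
    excluded = frozenset({"attr1ToExclude", "attr2ToExclude"})

    def __repr__(self):
        # Keep every populated field except the excluded ones.
        visible = {k: v for k, v in self.items() if k not in self.excluded}
        return json.dumps(visible, sort_keys=True)


record = CompactRecord(attr1="a", attr1ToExclude="x" * 10000)

# Assumed format string, modelled on the "Scraped from" log line.
message = "Scraped from %(src)s\n%(item)s" % {
    "src": "<200 http://example.com>",
    "item": record,
}
print(message)
```

The data itself is untouched: `record["attr1ToExclude"]` still holds the full value, only its printed form is shortened.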


渐次进展

2020-12-25 13:09
We use the following sample in production:
```
import logging

logging.getLogger('scrapy.core.scraper').addFilter(
    lambda x: not x.getMessage().startswith('Scraped from'))
```
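This code is simple and it works: the filter drops every record on the scrapy.core.scraper logger whose message starts with "Scraped from", which is the line that prints the item. We put it in the __init__.py of the module that contains the spiders, so it runs automatically for all spiders whenever a command such as scrapy crawl <spider_name> is executed.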
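If you prefer a named filter over a lambda, the same idea can be sketched with a `logging.Filter` subclass. The class names `DropScrapedItems` and `ListHandler` are made up for this illustration. The handler only collects messages so the effect is visible without running a crawl; the debug calls below are stand-ins for what Scrapy would log.

```python
import logging


class DropScrapedItems(logging.Filter):
    """Hypothetical filter: drops records whose message starts with a prefix."""

    def __init__(self, prefix="Scraped from"):
        super().__init__()
        self.prefix = prefix

    def filter(self, record):
        # Returning False tells logging to discard the record.
        return not record.getMessage().startswith(self.prefix)


class ListHandler(logging.Handler):
    """Collects messages in a list so the result can be inspected."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


logger = logging.getLogger("scrapy.core.scraper")
logger.setLevel(logging.DEBUG)
logger.propagate = False  # keep this demo out of the root logger's handlers
handler = ListHandler()
logger.addHandler(handler)
logger.addFilter(DropScrapedItems())

# Stand-in log calls; in a real crawl Scrapy emits these itself.
logger.debug("Scraped from %s", "<200 http://example.com>")
logger.debug("Dropped: %s", "duplicate item")

print(handler.messages)
```

Only the second message should survive, since the filter is attached to the logger itself and is checked before any handler sees the record.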