I have this object I'm trying to populate with an ItemLoader:

    {
      "domains": "string",
      "date_insert": "2016-12-23T11:25:00.213Z",
      "title": "
Thanks to @eLRuLL, I managed to find a decent solution:
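items.py:

    import scrapy
    from scrapy.loader import ItemLoader
    # On older Scrapy versions these live in scrapy.loader.processors
    from itemloaders.processors import Identity, MapCompose, TakeFirst
    from w3lib.html import remove_tags

    class StatsItem(scrapy.Item):
        views_count = scrapy.Field()
        comments_count = scrapy.Field()

    class ArticleItem(scrapy.Item):
        [...]
        # Identity() passes the nested stats dict through untouched
        stats = scrapy.Field(input_processor=Identity())

    class StatsItemLoader(ItemLoader):
        default_item_class = StatsItem
        default_input_processor = MapCompose(remove_tags)
        default_output_processor = TakeFirst()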
spider.py:

    def parse(self, response):
        [...]
        loader.add_value('stats', self.getStats(response))
        [...]

    def getStats(self, response):
        # Build the nested item with its own loader
        statsLoader = StatsItemLoader(response=response)
        statsLoader.add_xpath('comments_count', '//div[@class="btn-count"]//a/text()')
        statsLoader.add_value('views_count', '42')
        # Return a plain dict so the nested item can be serialized
        return dict(statsLoader.load_item())
Originally it did not work because the input processor for the stats field was MapCompose(remove_tags); switching that field to Identity() fixed it. To serialize the object, you also have to return dict(loader.load_item()) rather than just loader.load_item().
Thanks!