Scrapy CSS Selector ignore tags and get text only

Posted by 纵然是瞬间 on 2019-12-11 06:13:28

Question


I have the following HTML:

<li class="last">
    <span>SKU:</span> 483151
</li>

I was able to select it using:

SKU_SELECTOR = '.aaa .bbb .last ::text'
sku = response.css(SKU_SELECTOR).extract_first().strip()

How can I get only the number and ignore the span?


Answer 1:


Your CSS selector has an unnecessary space before ::text.

SKU_SELECTOR = '.aaa .bbb .last ::text'
                               ^

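The space acts as a descendant combinator, so the text of any descendant-or-self node matches the selector, including the text inside the span. You want only the text nodes directly under the element itself, so drop the space.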

I got it working:

>[0]: s = Selector(text='...')
>[1]: s.css('.last::text').extract()
<[1]: [u'\n    ', u' 483151\n']


Source: https://stackoverflow.com/questions/44260041/scrapy-css-selector-ignore-tags-and-get-text-only
