For example:
scrapy shell http://scrapy.org/
content = hxs.select('//*[@id="content"]').extract()[0]
print content
Then, I get the followin
At this moment, I don't think you need to install any third-party library. Scrapy provides this functionality through selectors:
Assume this complex selector:
sel = Selector(text='<a href="#">Click here to go to the <strong>Next Page</strong></a>')
We can get all of its text nodes using:
text_content = sel.xpath("//a[1]//text()").extract()
# which results in [u'Click here to go to the ', u'Next Page']
Then you can join them together easily:
''.join(text_content)
# Click here to go to the Next Page
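Text nodes taken from real pages often carry newlines and indentation, so a plain join can leave stray whitespace in the result. Here is a minimal sketch of a helper that joins the nodes and collapses whitespace. Note that join_text_nodes is a hypothetical name for illustration, not part of Scrapy, and it assumes the input is a list of strings such as the one returned by .extract():

```python
def join_text_nodes(nodes):
    """Concatenate extracted text nodes and collapse runs of whitespace.

    `nodes` is assumed to be a list of strings, such as the result of
    calling .extract() on a //text() XPath query.
    """
    # Concatenate without a separator: each text node already carries
    # its own leading and trailing whitespace.
    joined = ''.join(nodes)
    # str.split() with no arguments splits on any run of whitespace and
    # drops empty pieces, so re-joining with one space normalizes it.
    return ' '.join(joined.split())


text_content = ['Click here to go to the ', 'Next Page']
print(join_text_nodes(text_content))
```

One caveat: because nothing is inserted between nodes, text from adjacent elements with no whitespace between them will run together, so this may need adjusting for block-level markup.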