Can anyone please suggest an XPath expression format that returns a string value containing the concatenated values of certain qualifying child nodes of an element, but igno
If you want all children except p, you can try the following XPath 2.0 expression (string-join is not available in XPath 1.0)...
string-join(//*[name() != 'p']/text(), "")
which returns...
This text node should be returned.
And the value of this element.
And this.
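You could also use an XSLT for-each loop and assemble the values in a variable, like this. As written it collects every text node under div; add a predicate such as [not(parent::p)] to the select to skip text inside p.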
<xsl:variable name="newstring">
  <xsl:for-each select="/div//text()">
    <xsl:value-of select="."/>
  </xsl:for-each>
</xsl:variable>
In XPath 1.0:
You can use
/div//text()[not(parent::p)]
to capture the wanted text nodes. The concatenation itself cannot be done in XPath 1.0 for an arbitrary number of nodes, so I recommend doing it in the host application.
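For what it's worth, here is a minimal sketch of doing the concatenation in the host application, assuming Python and only the standard-library xml.etree.ElementTree module. The sample document and the helper name text_excluding are made up for illustration (the question's XML is not reproduced here). ElementTree's XPath subset cannot select text nodes, so the sketch walks the tree instead of evaluating /div//text()[not(parent::p)]. Note that it skips everything inside a p, including nested elements, which is slightly stricter than not(parent::p).

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, loosely modelled on the output quoted in this thread.
SAMPLE = (
    "<div>This text node should be returned."
    "<em>And the value of this element.</em>"
    "And this."
    "<p>But not this.</p>"
    "</div>"
)


def text_excluding(elem, skip_tag):
    """Concatenate the text under elem, skipping skip_tag elements and their contents."""
    parts = [elem.text or ""]
    for child in elem:
        if child.tag != skip_tag:
            parts.append(text_excluding(child, skip_tag))
        # Tail text follows the child but belongs to elem, so it is always kept.
        parts.append(child.tail or "")
    return "".join(parts)


root = ET.fromstring(SAMPLE)
result = text_excluding(root, "p")
print(result)
```

With a library that implements full XPath 1.0, you could instead evaluate the expression above and join the returned text nodes in one line.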
/div//text()
The double slash selects text nodes at any depth under div, regardless of the intermediate elements.
In XPath 2.0:
string-join(/*/node()[not(self::p)], '')
I know this comes a bit late, but I figure my answer could still be relevant. I recently ran into a similar problem. Because I use Scrapy in Python 3.6, which does not support XPath 2.0, I could not use the string-join function suggested in several online answers.
I ended up finding a simple workaround (shown below) that I did not see in any of the Stack Overflow answers, which is why I'm sharing it.
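# response is the Scrapy response object; select the element(s) whose text is wanted
temp_selector_list = response.xpath('/div')
# join all descendant text nodes of each selected element into a single string
string_result = [''.join(x.xpath(".//text()").extract()) for x in temp_selector_list]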
Hope this helps!