Question
Given the following XML:
<div id="Main">
    <div class="quote">
        This is a quote and I don't want this text
    </div>
    <p>
        This is content.
    </p>
    <p>
        This is also content and I want both of them
    </p>
</div>
Is there an XPath expression that selects the inner text of div#Main as a single node, but excludes the text of any div.quote?
I just want the text: "This is content.This is also content and I want both of them"
Thanks in advance.
Here is the code to test the XPath. I'm using .NET with HtmlAgilityPack, but I believe the XPath should work in any language.
[Test]
public void TestSelectNode()
{
    // Arrange
    var html = "<div id=\"Main\"><div class=\"quote\">This is a quote and I don't want this text</div><p>This is content.</p><p>This is also content and I want both of them</p></div>";
    var xPath = "//div/*[not(self::div and @class=\"quote\")]/text()";
    var doc = new HtmlDocument();
    doc.LoadHtml(html);
    // Action
    var node = doc.DocumentNode.SelectSingleNode(xPath);
    // Assert
    Assert.AreEqual("This is content.This is also content and I want both of them", node.InnerText);
}
The test fails, of course, because the XPath is still not correct:
Test 'XPathExperiments/TestSelectNode' failed:
Expected values to be equal.
Expected Value : "This is content.This is also content and I want both of them"
Actual Value : "This is content."
Answer 1:
I don't think there is an XPath that will give you this as a single node, because the values you're trying to obtain aren't a single node. Is there a reason you can't do this?
StringBuilder sb = new StringBuilder();
// Action
var nodes = doc.DocumentNode.SelectNodes(xPath);
foreach (var node in nodes)
{
    sb.Append(node.InnerText);
}
// Assert
Assert.AreEqual("This is content.This is also content and I want both of them",
    sb.ToString());
Answer 2:
You want the text of any child of the div that is not a div with class quote:
div/*[not(self::div and @class="quote")]/text()
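A minimal sketch of how this expression might be used with HtmlAgilityPack, assuming the single-line `html` string from the question. The `//div[@id='Main']` prefix, the class name, and the method name are my own additions for illustration. The point is that `SelectNodes` returns every matching text node, whereas `SelectSingleNode` stops at the first one, which is why the original test only saw "This is content.".

```csharp
using System.Linq;
using HtmlAgilityPack;

public static class QuoteFilterSketch
{
    // Hypothetical helper: concatenates the text of every child of div#Main
    // except <div class="quote">.
    public static string MainTextWithoutQuotes(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var nodes = doc.DocumentNode.SelectNodes(
            "//div[@id='Main']/*[not(self::div and @class='quote')]/text()");

        // SelectNodes returns null (not an empty collection) when nothing matches.
        if (nodes == null)
        {
            return string.Empty;
        }

        // Join the text nodes in document order.
        return string.Concat(nodes.Select(n => n.InnerText));
    }
}
```

Called with the `html` string from the question's test, this should return "This is content.This is also content and I want both of them".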
Answer 3:
There's no XPath 1.0 expression that would give you a combined string value for an arbitrary set of nodes, because a path expression selects node objects and only node objects, even if they're text nodes.
Seeing as you have <p> nodes in the <div> in question, I'd use
div[@id='Main']/p/text()
which produces a list of the text nodes in <p> elements in a <div id="Main">. Iterating through these and concatenating their text contents should be simple.
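The iterate-and-concatenate step could look roughly like the sketch below, again assuming HtmlAgilityPack. The class and method names are hypothetical, and the leading `//` is added so the path matches from the document root. One extra assumption: each text node is trimmed, because with the multi-line markup shown in the question the text inside each <p> is surrounded by line breaks that would otherwise end up in the result.

```csharp
using System.Linq;
using HtmlAgilityPack;

public static class ParagraphTextSketch
{
    // Hypothetical helper: joins the text nodes of every <p> directly under div#Main.
    public static string JoinParagraphText(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var textNodes = doc.DocumentNode.SelectNodes("//div[@id='Main']/p/text()");

        // No <p> text under div#Main: SelectNodes gives null, so return an empty string.
        if (textNodes == null)
        {
            return string.Empty;
        }

        // Trim each node so line breaks from pretty-printed markup are dropped.
        return string.Concat(textNodes.Select(n => n.InnerText.Trim()));
    }
}
```

Whether trimming is appropriate depends on the input; for content where leading or trailing spaces matter, drop the `Trim()` call.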
Source: https://stackoverflow.com/questions/14614318/xpath-select-text-of-selected-child-nodes