Question
I have some HTML in which the fields are separated by <br/>, e.g.:
Jack Janson
<br/>
309 123 456
<br/>
My Special Street 43
What is the easiest way to retrieve this information as 3 columns?
I am not an XPath expert, so another approach would be to split the string on the line breaks and just work with the resulting array. Is there a smarter way to do it?
Answer 1:
In pure XPath over XML, you would use an XPath expression like this: //preceding-sibling::br or //following-sibling::br (see any reference on XPath axes for help).
But the XPath-over-HTML implementation that you'll find in Html Agility Pack does not support selecting pure text nodes or attribute nodes in XPath expressions (//br/text() or //br/@blah do not work, for example). Note that they do work inside filters, so //br[text()='blah'] and //br[@att='blah'] work.
So, back to the question: you need to combine XPath and code, something like this:
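// Requires: using System; using HtmlAgilityPack;
HtmlDocument doc = new HtmlDocument();
doc.Load(myHtmlFile); // myHtmlFile: path to the HTML file
// SelectNodes returns null (not an empty collection) when nothing matches.
HtmlNodeCollection breaks = doc.DocumentNode.SelectNodes("//br");
if (breaks != null)
    foreach (HtmlNode br in breaks)
    {
        // The text in front of each <br/> is its previous sibling node.
        Console.WriteLine(br.PreviousSibling?.InnerText.Trim());
    }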
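That will output:
Jack Janson
309 123 456
With the fragment from the question, the last field (My Special Street 43) is not printed, because no <br/> follows it.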
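To collect every field, including one that has no <br/> after it, another option is to walk the text nodes directly instead of the <br/> elements. The following is only a minimal sketch: the class name BrSplitter and the method SplitOnBr are hypothetical, it uses LoadHtml on a string rather than Load on a file, and it assumes the text nodes are direct children of the document root, as in the question's fragment.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack;

static class BrSplitter // hypothetical helper, not part of Html Agility Pack
{
    // Returns the trimmed, non-empty text segments found between <br/> elements.
    public static List<string> SplitOnBr(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var fields = new List<string>();
        // Assumption: the text sits directly under the root node.
        // For nested markup, iterate doc.DocumentNode.Descendants() instead.
        foreach (HtmlNode node in doc.DocumentNode.ChildNodes)
        {
            if (node.NodeType != HtmlNodeType.Text)
                continue; // skip the <br/> elements themselves

            string text = HtmlEntity.DeEntitize(node.InnerText).Trim();
            if (text.Length > 0)
                fields.Add(text); // ignore whitespace-only nodes
        }
        return fields;
    }

    static void Main()
    {
        // The fragment from the question.
        string html = "Jack Janson\n<br/>\n309 123 456\n<br/>\nMy Special Street 43";
        foreach (string field in SplitOnBr(html))
            Console.WriteLine(field);
    }
}
```

For the question's fragment this is intended to yield one entry per field (name, phone number, street), which can then be mapped to the 3 columns by index.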
Source: https://stackoverflow.com/questions/6102761/htmlagilitypack-and-separating-on-br