Question
I want to get the value of an attribute using HtmlAgilityPack. HTML code:
<link href="style.css">
<link href="anotherstyle.css">
<link href="anotherstyle2.css">
<link itemprop="thumbnailUrl" href="http://image.jpg">
<link href="anotherstyle5.css">
<link href="anotherstyle7.css">
I want to get the last href attribute.
My C# code:
HtmlWeb web = new HtmlWeb();
HtmlAgilityPack.HtmlDocument htmldoc = web.Load(Url);
htmldoc.OptionFixNestedTags = true;
var navigator = (HtmlNodeNavigator)htmldoc.CreateNavigator();
string xpath = "//link/@href";
string val = navigator.SelectSingleNode(xpath).Value;
But that code returns the first href value.
Answer 1:
The following XPath selects the link elements that have an href attribute defined. From those links you then take the last one:
var link = doc.DocumentNode.SelectNodes("//link[@href]").LastOrDefault();
// SelectNodes returns null when nothing matches, so add null checks in real code
var href = link.Attributes["href"].Value; // "anotherstyle7.css"
You can also use the last() XPath function. The parentheses make it apply to the whole node set rather than to each parent's children:
var link = doc.DocumentNode.SelectSingleNode("(//link[@href])[last()]");
var href = link.Attributes["href"].Value;
UPDATE: If you want the last element that has both itemprop and href attributes, use the XPath (//link[@href and @itemprop])[last()], or //link[@href and @itemprop] if you go with the first approach.
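A minimal, self-contained sketch of the last() approach, assuming the HtmlAgilityPack NuGet package is referenced and using the question's markup as an inline string instead of loading a URL. The `LinkHelper` class and `LastHref` method are hypothetical names for illustration, not part of the library:

```csharp
using System;
using HtmlAgilityPack;

public static class LinkHelper
{
    // Hypothetical helper: returns the href of the last <link> in the document
    // that has an href attribute, or null when no such element exists.
    public static string LastHref(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // The parentheses apply last() to the whole node set,
        // not to each parent's children separately.
        var link = doc.DocumentNode.SelectSingleNode("(//link[@href])[last()]");

        // SelectSingleNode returns null when nothing matches.
        return link?.GetAttributeValue("href", "");
    }

    public static void Main()
    {
        // Markup copied from the question.
        const string html = @"
<link href=""style.css"">
<link href=""anotherstyle.css"">
<link href=""anotherstyle2.css"">
<link itemprop=""thumbnailUrl"" href=""http://image.jpg"">
<link href=""anotherstyle5.css"">
<link href=""anotherstyle7.css"">";

        Console.WriteLine(LastHref(html));
    }
}
```

Returning null instead of throwing lets the caller decide how to handle a page without any matching link.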
Answer 2:
You need something like this:
HtmlWeb web = new HtmlWeb();
HtmlAgilityPack.HtmlDocument htmldoc = web.Load(Url);
htmldoc.OptionFixNestedTags = true;
var navigator = (HtmlNodeNavigator)htmldoc.CreateNavigator();
string xpath = "//link[@itemprop]/@href";
string val = navigator.SelectSingleNode(xpath).Value;
Answer 3:
Load the web page as an HtmlDocument and directly select the last link tag:
HtmlWeb web = new HtmlWeb();
HtmlDocument doc = web.Load(Url);
var output = doc.DocumentNode.SelectNodes("//link[@href]").LastOrDefault();
var data = output.Attributes["href"].Value;
Or load the web page as an HtmlDocument, get the collection of all matching link tags, loop over it, and read the attribute from the last one:
HtmlWeb web = new HtmlWeb();
HtmlDocument doc = web.Load(Url);
int count = 0;
string data = "";
var output = doc.DocumentNode.SelectNodes("//link[@href]");
foreach (var item in output)
{
    count++;
    if (count == output.Count)
    {
        data = item.Attributes["href"].Value;
        break;
    }
}
Answer 4:
OK, I ended up with this:
var link = htmldoc.DocumentNode.SelectSingleNode("//link[@itemprop='thumbnailUrl']");
var href = link.Attributes["href"].Value;
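Note that SelectSingleNode returns null when no link carries that itemprop value, in which case the property access above throws. A hedged sketch of a null-safe variant, assuming the HtmlAgilityPack NuGet package is referenced; `ThumbnailHelper` and `ThumbnailHref` are hypothetical names used only for this example:

```csharp
using System;
using HtmlAgilityPack;

public static class ThumbnailHelper
{
    // Hypothetical helper: returns the href of the first <link> whose itemprop
    // is 'thumbnailUrl', or null when the page has no such element.
    public static string ThumbnailHref(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var link = doc.DocumentNode.SelectSingleNode("//link[@itemprop='thumbnailUrl']");

        // GetAttributeValue falls back to the given default ("")
        // if the element exists but has no href attribute.
        return link?.GetAttributeValue("href", "");
    }

    public static void Main()
    {
        const string html =
            "<link href=\"style.css\">" +
            "<link itemprop=\"thumbnailUrl\" href=\"http://image.jpg\">" +
            "<link href=\"anotherstyle7.css\">";

        Console.WriteLine(ThumbnailHref(html) ?? "(no thumbnail link found)");
    }
}
```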
Answer 5:
Get an HtmlNode by attribute value:
public static class Extensions
{
    public static HtmlNode GetNodeByAttributeValue(this HtmlNode htmlNode, string attributeName, string attributeValue)
    {
        if (htmlNode.Attributes.Contains(attributeName))
        {
            if (string.Compare(htmlNode.Attributes[attributeName].Value, attributeValue, true) == 0)
            {
                return htmlNode;
            }
        }
        foreach (var childHtmlNode in htmlNode.ChildNodes)
        {
            var resultNode = GetNodeByAttributeValue(childHtmlNode, attributeName, attributeValue);
            if (resultNode != null) return resultNode;
        }
        return null;
    }
}
Usage:
var searchResultsDiv = pageDocument.DocumentNode.GetNodeByAttributeValue("someattributename", "resultsofsearch");
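For comparison, the same depth-first search can probably be written without hand-rolled recursion by combining HtmlAgilityPack's DescendantsAndSelf() with LINQ. This is an alternative technique, not the answer's own method, and it is only a sketch: `NodeSearch` and `FindByAttributeValue` are hypothetical names, and it uses an ordinal case-insensitive comparison where the extension above uses a culture-sensitive one.

```csharp
using System;
using System.Linq;
using HtmlAgilityPack;

public static class NodeSearch
{
    // Hypothetical helper: returns the first node (the root itself or any
    // descendant, in document order) whose attribute `name` equals `value`,
    // ignoring case; returns null when nothing matches.
    public static HtmlNode FindByAttributeValue(HtmlNode root, string name, string value)
    {
        return root.DescendantsAndSelf()
                   .FirstOrDefault(n =>
                       n.Attributes.Contains(name) &&
                       string.Equals(n.Attributes[name].Value, value,
                                     StringComparison.OrdinalIgnoreCase));
    }

    public static void Main()
    {
        var doc = new HtmlDocument();
        // Illustrative markup, not taken from the original thread.
        doc.LoadHtml("<div id=\"outer\"><span class=\"ResultsOfSearch\">hit</span></div>");

        var node = FindByAttributeValue(doc.DocumentNode, "class", "resultsofsearch");
        Console.WriteLine(node != null ? node.Name : "(not found)");
    }
}
```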
Source: https://stackoverflow.com/questions/21236359/get-a-value-of-an-attribute-by-htmlagilitypack