html-agility-pack | 易学教程

Parsing HTML using Xpath with Javascript

Question: In .NET there is a lovely library, HTML Agility Pack, that lets me easily parse an external HTML page using XPath queries. The problem is that I have to do this client-side, so JavaScript only. Is there any way to do that?

Answer 1: jQuery also supports XPath selectors as well as CSS selectors; you can get more information from the link below. http://docs.jquery.com/DOM/Traversing/Selectors

Answer 2: You can try https://github.com/andrejpavlovic/xpathjs. Actually there are a lot of them, and there is an window

vb.net HtmlAgilityPack Insert string after div

Question: I'm trying to insert some of my own HTML directly after the end of a div. This div has other divs inside it.

    Dim HtmlNode As HtmlNode = HtmlNode.CreateNode("<span class=""label"">Those were the friends</span>")
    Dim FriendDiv = htmldoc.DocumentNode.SelectSingleNode("//div[@class='profile_friends']")
    Dim NewHTML As HtmlNode = htmldoc.DocumentNode.InsertAfter(HtmlNode, FriendDiv)

Every time I run that code I get an exception: Node "<div class="profile_topfriends"></div>" was not found in the
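One likely cause of the exception above is that InsertAfter is called on DocumentNode while the reference div is nested deeper in the tree; InsertAfter-style methods generally expect the reference node to be a direct child of the node they are called on. The sketch below illustrates the pattern in C#, using System.Xml from the .NET standard library as a stand-in so that it runs without extra packages. The class and method names are hypothetical. In HtmlAgilityPack the analogous call would presumably be FriendDiv.ParentNode.InsertAfter(newNode, FriendDiv).

```csharp
using System;
using System.Xml;

// Hypothetical helper: System.Xml stands in for HtmlAgilityPack here.
public static class InsertAfterSibling
{
    // Inserts the first node of fragmentXml directly after `reference`.
    public static XmlNode InsertAfterNode(XmlNode reference, string fragmentXml)
    {
        XmlDocument doc = reference.OwnerDocument;
        XmlDocumentFragment fragment = doc.CreateDocumentFragment();
        fragment.InnerXml = fragmentXml;
        XmlNode newNode = fragment.FirstChild;

        // Call InsertAfter on the node that actually contains `reference`.
        // Calling it on an ancestor further up the tree throws, because
        // `reference` is not one of that ancestor's direct children.
        return reference.ParentNode.InsertAfter(newNode, reference);
    }

    public static void Main()
    {
        var doc = new XmlDocument();
        doc.LoadXml(
            "<html><body>" +
            "<div class=\"profile_friends\"><div class=\"inner\">x</div></div>" +
            "</body></html>");

        XmlNode friendDiv = doc.SelectSingleNode("//div[@class='profile_friends']");
        InsertAfterNode(friendDiv, "<span class=\"label\">Those were the friends</span>");

        // The new span is now the next sibling of the div.
        Console.WriteLine(friendDiv.NextSibling.Name);
    }
}
```

Resolving the parent from the reference node itself avoids having to know how deep the div sits in the document.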

Can I use HtmlAgilityPack to split an HTML document on a certain tag?

Question: For example, I have a bunch of <tr> tags I'd like to collect. I need to split each of these tags into individual elements, for easier parsing on my part. Is this possible? An example of the markup:

    <tr class="first-in-year">
    <td class="year">2011</td>
    <td class="img"><a href="/battlefield-3/61-27006/"><img src="http://media.giantbomb.com/uploads/6/63038/1700748-bf3_thumb.jpg" alt=""></a></td>
    <td class="title"> <a href="/battlefield-3/61-27006/">Battlefield 3</a> <p class="deck">Battlefield

Html agility pack not loading url

Question: I have something like this:

    class MyTask
    {
        public MyTask(int id)
        {
            Id = id;
            IsBusy = false;
            Document = new HtmlDocument();
        }

        public HtmlDocument Document { get; set; }
        public int Id { get; set; }
        public bool IsBusy { get; set; }
    }

    class Program
    {
        public static void Main()
        {
            var task = new MyTask(1);
            task.Document.LoadHtml("http://urltomysite");
            if (task.Document.DocumentNode.SelectNodes("//span[@class='some-class']").Count == 0)
            {
                task.IsBusy = false;
                return;
            }
        }
    }

Now when I start my program

If Html File Has No Ending “/tr” Tag OR “/td” Tag Then HTML Agility Pack Does Not Read That Information Perfectly

Question: I am using HTML Agility Pack to parse HTML content and extract table information. It works, but if a closing </tr> or </td> tag is missing, the rows or cells that lack the closing tag are not parsed correctly. For example:

    <html> <head> <meta name="generator" content="HTML Tidy for Windows (vers 14 February 2006), see www.w3.org"> <title></title> </head>
    <body> <table cellspacing="0" cellpadding="0" width="100%" border="0"> <tbody> <tr> <td class="xl27

HtmlAgilityPack XPath case ignoring

Question: When I use SelectSingleNode("//meta[@name='keywords']") it doesn't work, but when I use the same case as in the original document it works: SelectSingleNode("//meta[@name='Keywords']"). So the question is, how can I make the match ignore case?

Answer 1: If you need a more comprehensive solution, you can write an extension function for the XPath processor which will perform a case-insensitive comparison. It is quite a bit of code, but you only write it once. After implementing the extension you can
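A lighter-weight alternative to the extension function, which the excerpt does not show, is the XPath 1.0 translate() function: it maps uppercase letters in the attribute value to lowercase before the comparison. The sketch below is an assumption-laden illustration rather than the answer's code. It uses System.Xml as a stand-in so that it is self-contained, the class and method names are hypothetical, and it handles ASCII letters only. The same XPath string should also be usable with HtmlAgilityPack's SelectSingleNode.

```csharp
using System;
using System.Xml;

// Hypothetical helper: builds an XPath that compares @name case-insensitively.
public static class CaseInsensitiveXPath
{
    private const string Upper = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
    private const string Lower = "abcdefghijklmnopqrstuvwxyz";

    // Assumes `name` contains no single quote; only ASCII letters are folded.
    public static string MetaByName(string name)
    {
        return "//meta[translate(@name,'" + Upper + "','" + Lower + "')='"
               + name.ToLowerInvariant() + "']";
    }

    public static void Main()
    {
        var doc = new XmlDocument();
        doc.LoadXml("<html><head><meta name=\"Keywords\" content=\"a, b\" /></head></html>");

        // Matches even though the document spells the value "Keywords".
        XmlNode node = doc.SelectSingleNode(MetaByName("keywords"));
        Console.WriteLine(node == null ? "not found" : node.Attributes["content"].Value);
    }
}
```

This folds case only in the attribute value; element and attribute names in the expression are still matched as written.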

XPath - select text of selected child nodes

Question: Given the following XML:

    <div id="Main">
    <div class="quote"> This is a quote and I don't want this text </div>
    <p> This is content. </p>
    <p> This is also content and I want both of them </p>
    </div>

Is there an XPath expression that selects the inner text of div#Main as a single node, but excludes the text of any div.quote? I just want the text: "This is content.This is also content and I want both of them". Thanks in advance. Here is the code to test the XPath, I'm using .NET with
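One possible approach, offered as a sketch rather than as the accepted answer: select every text node under div#Main that has no div.quote ancestor, then join the pieces in code, since an XPath 1.0 node-set query returns the text nodes separately rather than as one node. System.Xml stands in for HtmlAgilityPack so the example is self-contained, and the class and method names are hypothetical.

```csharp
using System;
using System.Linq;
using System.Xml;

// Hypothetical helper: collects the text of div#Main, skipping div.quote.
public static class MainTextExtractor
{
    // Text nodes below div#Main that are not inside a div with class "quote".
    public const string ContentTextXPath =
        "//div[@id='Main']//text()[not(ancestor::div[@class='quote'])]";

    public static string ExtractContent(XmlDocument doc)
    {
        var parts = doc.SelectNodes(ContentTextXPath)
            .Cast<XmlNode>()
            .Select(n => n.Value.Trim())   // drop the padding around each piece
            .Where(s => s.Length > 0);     // ignore whitespace-only nodes
        return string.Concat(parts);
    }

    public static void Main()
    {
        var doc = new XmlDocument();
        doc.LoadXml(
            "<div id=\"Main\">" +
            "<div class=\"quote\"> This is a quote and I don't want this text </div>" +
            "<p> This is content. </p>" +
            "<p> This is also content and I want both of them </p>" +
            "</div>");

        // Prints: This is content.This is also content and I want both of them
        Console.WriteLine(ExtractContent(doc));
    }
}
```

If only the direct p children matter, the simpler expression //div[@id='Main']/p/text() would select the same two text nodes for this sample.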

Html Agility Pack get specific content from a <li> tag

Question: I need some text from this website, https://www.amazon.com/dp/B074J9SSPD; to be specific, I need to extract the data under the "About the Product" section. I tried:

    HtmlWeb web = new HtmlWeb();
    HtmlDocument doc = new HtmlDocument();
    doc = web.Load("https://amazon.com/dp/B074J9SSPD");
    foreach (var node in doc.DocumentNode.SelectNodes("//li[@class='showHiddenFeatureBullets']"))
    {
        string ar = node.InnerText;
        HtmlAttribute att = node.Attributes["class"];
        MessageBox.Show(ar.ToString());
        if (att.Value

Calling javascript function from HtmlAgilityPack

Question: I want to use HtmlAgilityPack in a forms application to read the content of some pages, but on the search subpage I need to invoke JavaScript, and the link looks like this: <a href="javascript:__doPostBack('lnkbtnNext','')" id="lnkbtnNext">Następny >></a> How can I call this function from my C# desktop application?

Answer 1: If you trust the source, it looks to me like you'd be better off invoking the WebBrowser control. HtmlAgilityPack does not provide a scripting engine.

Source: https://stackoverflow

select an element next to current element HtmlAgilityPack

Question: I'm using HtmlAgilityPack to parse an HTML page. I want to select a collection of h3 tags, then loop through it, and for each h3 element I want to select the element right next to it. Here is my sample HTML:

    <h3>Somthing here</h3> <ul>list of something</ul>
    <h3>Somthing here</h3> <ul>list of something</ul>
    <h3>Somthing here</h3> <ul>list of something</ul>
    <h3>Somthing here</h3> <ul>list of something</ul>
    <h3>Somthing here</h3> <ul>list of something</ul>

I know how to select the collection of h3, but
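One way to pair each h3 with the element that follows it, sketched under assumptions since the excerpt shows no answer, is the XPath following-sibling axis: evaluated relative to an h3, following-sibling::*[1] selects the nearest following element sibling and skips any whitespace text in between. System.Xml stands in for HtmlAgilityPack so the example is self-contained, and the class and method names are hypothetical. The relative expression should also be usable with HtmlAgilityPack's SelectSingleNode when called on each h3 node.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

// Hypothetical helper: pairs each h3 with the ul that directly follows it.
public static class HeadingListPairs
{
    public static List<KeyValuePair<string, string>> Pair(XmlDocument doc)
    {
        var result = new List<KeyValuePair<string, string>>();
        foreach (XmlNode h3 in doc.SelectNodes("//h3"))
        {
            // Nearest following element sibling, relative to this h3.
            XmlNode next = h3.SelectSingleNode("following-sibling::*[1]");
            if (next != null && next.Name == "ul")
            {
                result.Add(new KeyValuePair<string, string>(h3.InnerText, next.InnerText));
            }
        }
        return result;
    }

    public static void Main()
    {
        var doc = new XmlDocument();
        doc.LoadXml(
            "<body>" +
            "<h3>First</h3> <ul>list one</ul>" +
            "<h3>Second</h3> <ul>list two</ul>" +
            "</body>");

        foreach (var pair in Pair(doc))
        {
            Console.WriteLine(pair.Key + " -> " + pair.Value);
        }
    }
}
```

Checking the sibling's name keeps an h3 that is followed by something other than a ul from being paired with the wrong element.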