html-agility-pack

Parsing HTML using Xpath with Javascript

99封情书 posted on 2019-12-23 18:19:50
Question: In .NET there is a lovely library, HTML Agility Pack, that lets me easily parse an external HTML page using XPath queries. The problem is that I have to do this client-side, so JavaScript only. Is there any way to do that?
Answer 1: jQuery also supports XPath selectors as well as CSS selectors; you can get more information from the link below. http://docs.jquery.com/DOM/Traversing/Selectors
Answer 2: You can try https://github.com/andrejpavlovic/xpathjs Actually there are a lot of them, and there is a window

vb.net HtmlAgilityPack Insert string after div

孤街浪徒 posted on 2019-12-23 13:20:51
Question: I'm trying to insert some of my own HTML directly after the end of a div. This div has other divs inside it.
    Dim HtmlNode As HtmlNode = HtmlNode.CreateNode("<span class=""label"">Those were the friends</span>")
    Dim FriendDiv = htmldoc.DocumentNode.SelectSingleNode("//div[@class='profile_friends']")
    Dim NewHTML As HtmlNode = htmldoc.DocumentNode.InsertAfter(HtmlNode, FriendDiv)
Every time I run that code I get an exception:
    Node "<div class="profile_topfriends"></div>" was not found in the
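One possible cause, offered as a sketch and not as the accepted answer: `InsertAfter` looks for the reference node among the direct children of the node it is called on, so calling it on `DocumentNode` fails when the div is nested deeper. The example below is in C#, not VB.NET, and calls `InsertAfter` on the target's `ParentNode`. The class name, method name and sample markup are assumptions made for illustration.

```csharp
using System;
using HtmlAgilityPack;

public static class InsertAfterDemo
{
    // Inserts newHtml as the next sibling of the first node matching xpath
    // and returns the markup of the whole document afterwards.
    public static string InsertAfterMatch(string html, string xpath, string newHtml)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        HtmlNode target = doc.DocumentNode.SelectSingleNode(xpath);
        if (target == null)
        {
            // Nothing matched, so the markup is returned unchanged.
            return doc.DocumentNode.OuterHtml;
        }

        HtmlNode newNode = HtmlNode.CreateNode(newHtml);
        // Call InsertAfter on the parent of the reference node, because the
        // reference node must be a direct child of the node doing the insert.
        target.ParentNode.InsertAfter(newNode, target);
        return doc.DocumentNode.OuterHtml;
    }

    public static void Main()
    {
        // Hypothetical sample markup: the target div is nested inside <body>.
        string html = "<body><div class='profile_friends'><div>inner</div></div><p>after</p></body>";
        Console.WriteLine(InsertAfterMatch(
            html,
            "//div[@class='profile_friends']",
            "<span class=\"label\">Those were the friends</span>"));
    }
}
```

The new span ends up after the closing tag of the whole `profile_friends` div, including its inner divs, and before whatever followed it.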

Can I use HtmlAgilityPack to split an HTML document on a certain tag?

谁说我不能喝 posted on 2019-12-23 12:36:06
Question: For example, I have a bunch of <tr> tags I'd like to collect. I need to split each of these tags into individual elements, for easier parsing on my part. Is this possible? An example of the markup:
    <tr class="first-in-year">
      <td class="year">2011</td>
      <td class="img"><a href="/battlefield-3/61-27006/"><img src="http://media.giantbomb.com/uploads/6/63038/1700748-bf3_thumb.jpg" alt=""></a></td>
      <td class="title">
        <a href="/battlefield-3/61-27006/">Battlefield 3</a>
        <p class="deck">Battlefield
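A minimal sketch of one way to do this, assuming each row should become a list of its cell texts: select every `<tr>` with XPath, then walk its direct `<td>` children. The class name, method name and the second sample row are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack;

public static class RowSplitter
{
    // Returns one list of trimmed cell texts per <tr> found in the markup.
    public static List<List<string>> SplitRows(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var rows = new List<List<string>>();
        HtmlNodeCollection trNodes = doc.DocumentNode.SelectNodes("//tr");
        if (trNodes == null)
        {
            return rows; // by default SelectNodes returns null when nothing matches
        }

        foreach (HtmlNode tr in trNodes)
        {
            var cells = new List<string>();
            // Elements("td") enumerates only the direct <td> children of this row.
            foreach (HtmlNode td in tr.Elements("td"))
            {
                cells.Add(td.InnerText.Trim());
            }
            rows.Add(cells);
        }
        return rows;
    }

    public static void Main()
    {
        // Shortened sample based on the question; the second row is made up.
        string html =
            "<table>" +
            "<tr class='first-in-year'><td class='year'>2011</td><td class='title'><a href='#'>Battlefield 3</a></td></tr>" +
            "<tr><td class='year'>2010</td><td class='title'>Another title</td></tr>" +
            "</table>";

        foreach (List<string> row in SplitRows(html))
        {
            Console.WriteLine(string.Join(" | ", row));
        }
    }
}
```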

Html agility pack not loading url

ε祈祈猫儿з posted on 2019-12-23 11:13:13
Question: I have something like this:
    class MyTask
    {
        public MyTask(int id)
        {
            Id = id;
            IsBusy = false;
            Document = new HtmlDocument();
        }
        public HtmlDocument Document { get; set; }
        public int Id { get; set; }
        public bool IsBusy { get; set; }
    }
    class Program
    {
        public static void Main()
        {
            var task = new MyTask(1);
            task.Document.LoadHtml("http://urltomysite");
            if (task.Document.DocumentNode.SelectNodes("//span[@class='some-class']").Count == 0)
            {
                task.IsBusy = false;
                return;
            }
        }
    }
Now when I start my program
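A likely explanation, sketched here under the assumption that the goal is to fetch the page: `LoadHtml` parses the string it is given as markup and makes no HTTP request, so the URL is treated as plain text. By default `SelectNodes` also returns null when nothing matches, so calling `.Count` on the result can throw. The helper name below is hypothetical.

```csharp
using System;
using HtmlAgilityPack;

public static class LoadDemo
{
    // Counts the nodes matching xpath, treating a null result as zero matches.
    public static int CountMatches(HtmlDocument doc, string xpath)
    {
        HtmlNodeCollection nodes = doc.DocumentNode.SelectNodes(xpath);
        return nodes == null ? 0 : nodes.Count;
    }

    public static void Main()
    {
        // LoadHtml expects markup, so this document contains one matching span.
        var parsed = new HtmlDocument();
        parsed.LoadHtml("<span class='some-class'>hello</span>");
        Console.WriteLine(CountMatches(parsed, "//span[@class='some-class']"));

        // Passing a URL to LoadHtml parses the URL itself as literal text;
        // no request is made, so there is nothing to match.
        var notFetched = new HtmlDocument();
        notFetched.LoadHtml("http://urltomysite");
        Console.WriteLine(CountMatches(notFetched, "//span[@class='some-class']"));

        // To download a page, HtmlWeb.Load can be used instead (not run here,
        // because it needs network access):
        // HtmlDocument fetched = new HtmlWeb().Load("http://urltomysite");
    }
}
```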

If Html File Has No Ending “/tr” Tag OR “/td” Tag Then HTML Agility Pack Does Not Read That Information Perfectly

房东的猫 posted on 2019-12-23 09:38:54
Question: I am using HTML Agility Pack to parse HTML content and extract table information. It works, but rows and cells that have no closing "/tr" or "/td" tag are not parsed correctly. For example:
    <html>
    <head>
      <meta name="generator" content="HTML Tidy for Windows (vers 14 February 2006), see www.w3.org">
      <title></title>
    </head>
    <body>
      <table cellspacing="0" cellpadding="0" width="100%" border="0">
        <tbody>
          <tr>
            <td class="xl27

HtmlAgilityPack XPath case ignoring

余生颓废 posted on 2019-12-23 09:18:19
Question: When I use SelectSingleNode("//meta[@name='keywords']") it doesn't work, but when I use the same case as in the original document it works: SelectSingleNode("//meta[@name='Keywords']"). So the question is: how can I make the match ignore case?
Answer 1: If you need a more comprehensive solution, you can write an extension function for the XPath processor which will perform a case-insensitive comparison. It is quite a bit of code, but you only write it once. After implementing the extension you can
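A lighter alternative to the extension function described above, shown as a sketch: the XPath 1.0 `translate()` function can lowercase the attribute value before comparing. This is a different technique from the one in the answer, it only handles ASCII letters, and it assumes the name being searched for contains no single quote. The class and method names are hypothetical.

```csharp
using System;
using HtmlAgilityPack;

public static class CaseInsensitiveXPath
{
    private const string Upper = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
    private const string Lower = "abcdefghijklmnopqrstuvwxyz";

    // Returns the content attribute of the first <meta> whose name matches,
    // ignoring ASCII case, or null when there is no such element.
    public static string MetaContent(string html, string name)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // translate() maps each uppercase letter of @name to its lowercase form,
        // so the comparison is done on lowercased text on both sides.
        string xpath = "//meta[translate(@name, '" + Upper + "', '" + Lower + "') = '"
                       + name.ToLowerInvariant() + "']";

        HtmlNode meta = doc.DocumentNode.SelectSingleNode(xpath);
        return meta == null ? null : meta.GetAttributeValue("content", "");
    }

    public static void Main()
    {
        // Hypothetical sample document with a capitalised attribute value.
        string html = "<html><head><meta name='Keywords' content='a,b'></head></html>";
        Console.WriteLine(MetaContent(html, "keywords"));
    }
}
```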

XPath - select text of selected child nodes

為{幸葍}努か posted on 2019-12-23 05:28:21
Question: Given the following XML:
    <div id="Main">
      <div class="quote"> This is a quote and I don't want this text </div>
      <p> This is content. </p>
      <p> This is also content and I want both of them </p>
    </div>
Is there an XPath expression that selects the inner text of div#Main as a single node but excludes the text of any div.quote? I just want the text: "This is content.This is also content and I want both of them". Thanks in advance. Here is the code to test the XPath; I'm using .NET with
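XPath 1.0 selects nodes and does not concatenate them, so one possible approach, sketched here as an assumption about what the asker needs, is to select every text node under div#Main that has no div.quote ancestor and join the pieces in code. The class and method names are hypothetical.

```csharp
using System;
using System.Linq;
using HtmlAgilityPack;

public static class TextExtractor
{
    // Joins the trimmed text of every text node under div#Main that is not
    // inside a div with class "quote".
    public static string MainTextWithoutQuotes(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        HtmlNodeCollection textNodes = doc.DocumentNode.SelectNodes(
            "//div[@id='Main']//text()[not(ancestor::div[@class='quote'])]");
        if (textNodes == null)
        {
            return string.Empty;
        }

        // Whitespace-only text nodes between elements trim down to empty strings,
        // so they add nothing to the result.
        return string.Concat(textNodes.Select(n => n.InnerText.Trim()));
    }

    public static void Main()
    {
        string html =
            "<div id='Main'> <div class='quote'> This is a quote and I don't want this text </div>" +
            " <p> This is content. </p> <p> This is also content and I want both of them </p> </div>";
        Console.WriteLine(MainTextWithoutQuotes(html));
    }
}
```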

Html Agility Pack get specific content from a <li> tag

核能气质少年 posted on 2019-12-23 04:24:19
Question: I need some text from this website, https://www.amazon.com/dp/B074J9SSPD. To be specific, I need to extract the data under the "About the Product" section. I tried:
    HtmlWeb web = new HtmlWeb();
    HtmlDocument doc = new HtmlDocument();
    doc = web.Load("https://amazon.com/dp/B074J9SSPD");
    foreach (var node in doc.DocumentNode.SelectNodes("//li[@class='showHiddenFeatureBullets']"))
    {
        string ar = node.InnerText;
        HtmlAttribute att = node.Attributes["class"];
        MessageBox.Show(ar.ToString());
        if (att.Value

Calling javascript function from HtmlAgilityPack

核能气质少年 posted on 2019-12-23 03:14:45
Question: I want to use HtmlAgilityPack in a forms application to read the content of some pages, but on the search subpage I need to invoke JavaScript, and the link looks like this:
    <a href="javascript:__doPostBack('lnkbtnNext','')" id="lnkbtnNext">Następny >></a>
How can I call this function from my C# desktop application?
Answer 1: If you trust the source, it looks to me like you'd be better off using the WebBrowser control. HtmlAgilityPack does not provide a scripting engine.
Source: https://stackoverflow

select an element next to current element HtmlAgilityPack

大兔子大兔子 posted on 2019-12-23 01:52:34
Question: I'm using HtmlAgilityPack to parse an HTML page. I want to select a collection of h3 tags, loop through it, and for each h3 element select the element right next to it. Here is my sample HTML:
    <h3>Somthing here</h3> <ul>list of something</ul>
    <h3>Somthing here</h3> <ul>list of something</ul>
    <h3>Somthing here</h3> <ul>list of something</ul>
    <h3>Somthing here</h3> <ul>list of something</ul>
    <h3>Somthing here</h3> <ul>list of something</ul>
I know how to select collection of h3, but
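One way this could be done, as a sketch: evaluate a relative XPath from each h3 using the `following-sibling` axis. `NextSibling` would return the whitespace text node between the tags in markup like the sample, whereas `following-sibling::*[1]` picks the first element sibling. The class and method names, and the shortened sample, are assumptions.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack;

public static class SiblingDemo
{
    // Pairs each <h3> text with the text of the first element that follows it.
    public static List<KeyValuePair<string, string>> PairHeadings(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var pairs = new List<KeyValuePair<string, string>>();
        HtmlNodeCollection headings = doc.DocumentNode.SelectNodes("//h3");
        if (headings == null)
        {
            return pairs; // no headings found
        }

        foreach (HtmlNode h3 in headings)
        {
            // The XPath is evaluated relative to the current h3; "*" skips text
            // nodes and [1] keeps only the nearest following element.
            HtmlNode next = h3.SelectSingleNode("following-sibling::*[1]");
            if (next != null)
            {
                pairs.Add(new KeyValuePair<string, string>(
                    h3.InnerText.Trim(), next.InnerText.Trim()));
            }
        }
        return pairs;
    }

    public static void Main()
    {
        // Shortened, hypothetical version of the sample markup.
        string html = "<h3>First</h3> <ul>list one</ul> <h3>Second</h3> <ul>list two</ul>";
        foreach (KeyValuePair<string, string> pair in PairHeadings(html))
        {
            Console.WriteLine(pair.Key + " -> " + pair.Value);
        }
    }
}
```

Replacing `*` with `ul` in the expression restricts the match to the nearest following list.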