WebDriver can find element using xpath, Html Agility Pack cannot

Submitted by Deadly on 2019-12-30 10:59:06

Question


I have continually had problems with Html Agility Pack; my XPath queries only ever work when they are extremely simple:

//*[@id='some_id']

or

//input

However, whenever they get more complicated, Html Agility Pack can't handle them. Here's an example demonstrating the problem. I'm using WebDriver to navigate to Google and return the page source, which is then passed to Html Agility Pack. Both WebDriver and Html Agility Pack attempt to locate the element/node with the same query (C#):

using System;
using System.Threading;
using HtmlAgilityPack;
using OpenQA.Selenium.Firefox;

//The XPath query
const string xpath = "//form//tr[1]/td[1]//input[@name='q']";

//Navigate to Google and get page source
var driver = new FirefoxDriver(new FirefoxProfile()) { Url = "http://www.google.com" };
Thread.Sleep(2000);

//Can WebDriver find it?
//(FindElementByXPath throws NoSuchElementException on no match rather than returning null)
var e = driver.FindElementByXPath(xpath);
Console.WriteLine(e != null ? "Webdriver success" : "Webdriver failure");

//Can Html Agility Pack find it?
//(SelectNodes returns null when nothing matches)
var source = driver.PageSource;
var htmlDoc = new HtmlDocument { OptionFixNestedTags = true };
htmlDoc.LoadHtml(source);
var nodes = htmlDoc.DocumentNode.SelectNodes(xpath);
Console.WriteLine(nodes != null ? "Html Agility Pack success" : "Html Agility Pack failure");

driver.Quit();

In this case, WebDriver successfully located the item, but Html Agility Pack did not.

I know that in this case it's very easy to change the XPath to one that will work, //input[@name='q'], but that only fixes this specific example, which isn't the point. I need something that exactly, or at least closely, mirrors the behavior of WebDriver's XPath engine, or even that of the FirePath or FireFinder add-ons for Firefox.

If WebDriver can find it, then why can't Html Agility Pack find it too?


Answer 1:


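The issue you're running into is with the FORM element. Html Agility Pack handles that element differently: by default, it will never report that it has children. Any XPath step that has to descend through the form, such as //form//tr, therefore matches nothing, even though the same nodes are still reachable by paths that skip the form.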

In the particular example you gave, this query does find the target element:

.//div/div[2]/table/tr/td/table/tr/td/div/table/tr/td/div/div[2]/input

However, this does not, so it's clear the form element is tripping up the parser:

.//form/div/div[2]/table/tr/td/table/tr/td/div/table/tr/td/div/div[2]/input
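One way to check this against your own page source is to ask Html Agility Pack how it classifies the form element and where the input actually ends up in the parsed tree. The following is only a minimal sketch: the HTML string is a made-up stand-in for driver.PageSource, the class name is hypothetical, and the flags and counts printed depend on the Html Agility Pack version you have installed.

```csharp
using System;
using HtmlAgilityPack;

class FormChildrenCheck
{
    static void Main()
    {
        // Hypothetical stand-in for driver.PageSource.
        const string html =
            "<html><body><form><div><input name='q'></div></form></body></html>";

        // Look up how this version classifies <form>; the entry may be absent.
        if (HtmlNode.ElementsFlags.TryGetValue("form", out HtmlElementFlag flags))
            Console.WriteLine("form flags: " + flags);
        else
            Console.WriteLine("form has no special flags");

        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // How many children does the parsed form node report?
        var form = doc.DocumentNode.SelectSingleNode("//form");
        Console.WriteLine("form child count: " + (form?.ChildNodes.Count ?? 0));

        // Which node did the input get attached to?
        var input = doc.DocumentNode.SelectSingleNode("//input[@name='q']");
        Console.WriteLine("input parent: " + input?.ParentNode.Name);
    }
}
```

If the form reports no children and the input's ancestors do not include the form, queries that route through the form cannot match.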

That behavior is configurable, though. If you place this line prior to parsing the HTML, the form will give you child nodes:

HtmlNode.ElementsFlags.Remove("form");
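Put together with the question's query, the fix might look like the sketch below. It assumes a small hand-written HTML string in place of the real Google page source, and the class name is hypothetical. The calls themselves (HtmlNode.ElementsFlags.Remove, LoadHtml, SelectNodes) are the ones already used above.

```csharp
using System;
using HtmlAgilityPack;

class FormFlagDemo
{
    static void Main()
    {
        // Hypothetical markup standing in for the Google page source.
        const string html =
            "<html><body><form action='/search'><table><tr><td>" +
            "<input name='q'></td></tr></table></form></body></html>";
        const string xpath = "//form//tr[1]/td[1]//input[@name='q']";

        // Must run before the HTML is parsed. ElementsFlags is static,
        // so this affects every document parsed afterwards in the process.
        HtmlNode.ElementsFlags.Remove("form");

        var htmlDoc = new HtmlDocument { OptionFixNestedTags = true };
        htmlDoc.LoadHtml(html);

        // SelectNodes returns null, not an empty collection, when nothing matches.
        var nodes = htmlDoc.DocumentNode.SelectNodes(xpath);
        Console.WriteLine(nodes != null
            ? "Html Agility Pack success"
            : "Html Agility Pack failure");
    }
}
```

Because the setting is global, it is probably simplest to apply it once at application startup rather than before each parse.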


Source: https://stackoverflow.com/questions/6127769/webdriver-can-find-element-using-xpath-html-agility-pack-cannot
