Question
In the code below, I can set the values of the username and password fields using HtmlAgilityPack, but I cannot invoke the click event of the login button (the button's id in the page source is "s1").
Is there any way to do this? The reason I'm not using the WebBrowser control is that I need HtmlAgilityPack to retrieve data from page elements that have no IDs in the source.
var doc = new HtmlWeb().Load("http://MYURL.com");
doc.DocumentNode.SelectSingleNode("name").SetAttributeValue("value", "MyUsername");
doc.DocumentNode.SelectSingleNode("password").SetAttributeValue("value", "MyPassword");
Answer 1:
Is there any way for this to be done?
Not directly, at least not with what the HTML Agility Pack (HAP) library provides.
The HAP is great for fetching a single page and parsing it, but it is not designed for continued interaction with a site. Among the things it lacks are cookie management and JavaScript execution.
To log in you probably need to send an HTTP POST to the server that includes the form data, and the HAP can't help with that.
You will need to use a class like WebRequest to make the POST. I suggest using Fiddler to see what the browser's login request looks like and constructing yours to match, though that may be only the first step.
You may want to investigate web automation tools such as Selenium or WatiN instead.
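As a rough illustration of the WebRequest approach, here is a minimal sketch. It assumes the form posts URL-encoded fields named "name" and "password" (taken from the question's code) to a hypothetical http://MYURL.com/login address; the real URL, field names, and any hidden fields have to be read from the request captured in Fiddler.

```csharp
using System;
using System.IO;
using System.Net;
using System.Text;

// Assumption: hypothetical login URL and field names; check the real request in Fiddler.
string loginUrl = "http://MYURL.com/login";
string body = "name=" + Uri.EscapeDataString("MyUsername")
            + "&password=" + Uri.EscapeDataString("MyPassword");
byte[] payload = Encoding.UTF8.GetBytes(body);

// Keep the container around so later requests can send the session cookie back.
var cookies = new CookieContainer();

var request = (HttpWebRequest)WebRequest.Create(loginUrl);
request.Method = "POST";
request.ContentType = "application/x-www-form-urlencoded";
request.ContentLength = payload.Length;
request.CookieContainer = cookies;

// Write the form data into the request body.
using (Stream stream = request.GetRequestStream())
{
    stream.Write(payload, 0, payload.Length);
}

// Read the server's reply; the HTML can then be handed to HtmlDocument.LoadHtml.
using (var response = (HttpWebResponse)request.GetResponse())
using (var reader = new StreamReader(response.GetResponseStream()))
{
    string html = reader.ReadToEnd();
    Console.WriteLine("Login response status: " + response.StatusCode);
}
```

Assigning the same CookieContainer to each follow-up HttpWebRequest is what carries the logged-in session from one request to the next.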
Answer 2:
You need to observe the login POST request in Fiddler and see how it is structured. For instance:
{"userName":"you","password":"pwd"}
A site usually recognizes that you are logged in by receiving its cookie back in your requests.
By default, HttpClient sends the cookies received from a domain with each subsequent request to that domain (until you dispose of that HttpClient instance).
1) Create a cookie container and assign it to the HttpClientHandler that you pass to your HttpClient instance.
2) Use HttpClient to make the login POST request.
3) Use HttpClient to make the data GET request.
4) Read the html string from the response.
5) Use HtmlAgilityPack HtmlDocument to load the document from the html string and not from the web (as most examples show).
using System;
using System.Collections.Generic;
using System.Linq;
using System.Net;
using System.Net.Http;
using System.Net.Http.Json; // PostAsJsonAsync; older projects get it from the Microsoft.AspNet.WebApi.Client package
using HtmlAgilityPack;

string baseUrl = "https://www.yourwebsite.com";
string loginUrl = "/Account/LogOn";
string dataUrl = "/Data";
var uri = new Uri(baseUrl);

// The handler stores cookies from the login response and replays them on later requests.
CookieContainer cookies = new CookieContainer();
HttpClientHandler handler = new HttpClientHandler();
handler.CookieContainer = cookies;

using (var client = new HttpClient(handler))
{
    client.BaseAddress = uri;

    // Log in by posting the credentials as JSON.
    var request = new { userName = "you", password = "pwd" };
    var resLogin = client.PostAsJsonAsync(loginUrl, request).Result;
    if (resLogin.StatusCode != HttpStatusCode.OK)
        Console.WriteLine("Could not log in -> StatusCode = " + resLogin.StatusCode);

    // See what cookies are returned.
    IEnumerable<Cookie> responseCookies = cookies.GetCookies(uri).Cast<Cookie>();
    foreach (Cookie cookie in responseCookies)
        Console.WriteLine(cookie.Name + ": " + cookie.Value);

    // Request the protected page; the session cookie is sent automatically.
    var resData = client.GetAsync(dataUrl).Result;
    if (resData.StatusCode != HttpStatusCode.OK)
        Console.WriteLine("Could not get data html -> StatusCode = " + resData.StatusCode);

    // Parse the returned HTML string with HtmlAgilityPack.
    var html = resData.Content.ReadAsStringAsync().Result;
    var doc = new HtmlDocument();
    doc.LoadHtml(html);
}
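Many login pages submit an ordinary HTML form rather than JSON. If Fiddler shows a body like userName=you&password=pwd, the login step might look like the sketch below instead. The URL and field names are the same placeholders used above and are assumptions, not values from any real site.

```csharp
using System;
using System.Collections.Generic;
using System.Net;
using System.Net.Http;

// The handler keeps the session cookie for later requests made with this client.
var handler = new HttpClientHandler { CookieContainer = new CookieContainer() };
using (var client = new HttpClient(handler))
{
    client.BaseAddress = new Uri("https://www.yourwebsite.com");

    // Assumption: the field names must match what the real form sends.
    var form = new FormUrlEncodedContent(new Dictionary<string, string>
    {
        ["userName"] = "you",
        ["password"] = "pwd"
    });

    // Sends the body as application/x-www-form-urlencoded.
    var resLogin = client.PostAsync("/Account/LogOn", form).Result;
    Console.WriteLine("Login status: " + resLogin.StatusCode);
}
```

The rest of the flow (the GET for the data page and HtmlDocument.LoadHtml) stays the same, as long as it runs inside the same using block so the cookies are reused.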
Answer 3:
I don't know if you're using the WPF WebBrowser control, but if you are, you can use something along the lines of
doc.GetElementById("submit_signin").Click();
That's what works for me.
Source: https://stackoverflow.com/questions/13568933/login-to-website-using-htmlagilitypack