Log in to a website using HtmlAgilityPack

Submitted by 北城余情 on 2019-12-12 07:26:18

Question


In the code below, I can set the values of the username and password fields using the HtmlAgilityPack, but I cannot invoke the click event of the login button (the button's id in the page source is "s1").

Is there any way for this to be done? I'm not using the WebBrowser control because I need the HtmlAgilityPack to retrieve data from elements that have no IDs in the page source.

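var doc = new HtmlWeb().Load("http://MYURL.com");
// The XPath assumes the inputs carry name="name" and name="password"; adjust to the real page.
doc.DocumentNode.SelectSingleNode("//input[@name='name']").SetAttributeValue("value", "MyUsername");
doc.DocumentNode.SelectSingleNode("//input[@name='password']").SetAttributeValue("value", "MyPassword");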

Answer 1:


Is there any way for this to be done?

Not with what the HTML Agility Pack (HAP) library provides - not directly.

The HAP is great for getting a single page and parsing it, but it is not designed for continued interactions. Things that are missing are cookie management, JavaScript interaction and more.

In order to log in you probably need to send an HTTP POST to the server, including the data you want - the HAP can't help with that.

You will need to use a class like WebRequest to make the POST. I suggest using Fiddler to see what the browser's login request looks like and constructing yours accordingly, though that may just be the first step.

You may want to investigate the use of web automation tools such as Selenium or WatiN instead.
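
As a rough illustration of the WebRequest route, here is a minimal sketch. It assumes the site accepts a plain form-encoded POST and that the form fields are called `name` and `password` (taken from the question's code, not verified against any real page). The class and method names are hypothetical, and the real request should be modelled on what Fiddler shows.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Net;
using System.Text;

// Hypothetical helper, not part of the HTML Agility Pack.
public static class FormLoginSketch
{
    // Encodes name/value pairs as an application/x-www-form-urlencoded body.
    public static string BuildFormBody(IEnumerable<KeyValuePair<string, string>> fields)
    {
        return string.Join("&", fields.Select(f =>
            WebUtility.UrlEncode(f.Key) + "=" + WebUtility.UrlEncode(f.Value)));
    }

    // Sends the body as a POST and returns the response HTML.
    // Cookies set by the server are stored in the supplied container,
    // so the same container can be reused for later requests.
    public static string PostForm(string url, string body, CookieContainer cookies)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        request.Method = "POST";
        request.ContentType = "application/x-www-form-urlencoded";
        request.CookieContainer = cookies;

        byte[] payload = Encoding.UTF8.GetBytes(body);
        request.ContentLength = payload.Length;
        using (Stream stream = request.GetRequestStream())
            stream.Write(payload, 0, payload.Length);

        using (var response = (HttpWebResponse)request.GetResponse())
        using (var reader = new StreamReader(response.GetResponseStream()))
            return reader.ReadToEnd();
    }
}
```

A caller would build the body from the login fields, call `PostForm` with the form's action URL and a shared `CookieContainer`, and then pass the returned HTML to `HtmlDocument.LoadHtml`. Many login forms also expect hidden fields or tokens, so this may not be sufficient on its own. On .NET 6 and later, `WebRequest.Create` is marked obsolete in favour of `HttpClient`.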




Answer 2:


You need to observe the login POST request in Fiddler and see how its body is structured. For instance:

    {"userName":"you","password":"pwd"}

Usually, a site recognizes that you are logged in because your requests include its session cookie.

By default, HttpClient (through its HttpClientHandler) stores the cookies it receives from a domain and sends them with each subsequent request to that domain, until that HttpClient instance is disposed.

1) Create a cookie container and assign it to your HttpClient instance through an HttpClientHandler.

2) Use HttpClient to make the login POST request.

3) Use HttpClient to make the data GET request.

4) Read the html string from the response.

5) Use HtmlAgilityPack HtmlDocument to load the document from the html string and not from the web (as most examples show).

    using System;
    using System.Net;
    using System.Net.Http;
    using System.Net.Http.Json; // on older frameworks, PostAsJsonAsync comes from the Microsoft.AspNet.WebApi.Client package instead
    using HtmlAgilityPack;

    string baseUrl = "https://www.yourwebsite.com";
    string loginUrl = "/Account/LogOn";
    string dataUrl = "/Data";

    var uri = new Uri(baseUrl);

    // The handler stores cookies from the login response and sends them on later requests.
    var cookies = new CookieContainer();
    var handler = new HttpClientHandler { CookieContainer = cookies };

    using (var client = new HttpClient(handler))
    {
        client.BaseAddress = uri;

        // Log in; the anonymous object is serialized to the JSON body shown above.
        var request = new { userName = "you", password = "pwd" };
        var resLogin = client.PostAsJsonAsync(loginUrl, request).Result;
        if (resLogin.StatusCode != HttpStatusCode.OK)
            Console.WriteLine("Could not login -> StatusCode = " + resLogin.StatusCode);

        // See what cookies are returned.
        foreach (Cookie cookie in cookies.GetCookies(uri))
            Console.WriteLine(cookie.Name + ": " + cookie.Value);

        // Request the data page; the stored cookies are sent automatically.
        var resData = client.GetAsync(dataUrl).Result;
        if (resData.StatusCode != HttpStatusCode.OK)
            Console.WriteLine("Could not get data html -> StatusCode = " + resData.StatusCode);

        var html = resData.Content.ReadAsStringAsync().Result;

        // Parse the HTML string instead of loading the page from the web.
        var doc = new HtmlDocument();
        doc.LoadHtml(html);
    }



Answer 3:


I don't know if you're using the WPF WebBrowser control, but if you are, you can use something along the lines of

doc.GetElementById("submit_signin").Click();

That's what works for me.



Source: https://stackoverflow.com/questions/13568933/login-to-website-using-htmlagilitypack
