How to pass cookies to HtmlAgilityPack or WebClient?

一向 2020-12-17 20:02

I use this code to log in:

CookieCollection cookies = new CookieCollection();
HttpWebRequest request = (HttpWebRequest)WebRequest.Create("example.com");
req         


        
3 Answers
  • 2020-12-17 20:34

    There are some recommendations here: Using CookieContainer with WebClient class

    However, it's probably just easier to keep using the HttpWebRequest and set the cookie in the CookieContainer:

    • HTTPWebRequest and CookieContainer
    • http://msdn.microsoft.com/en-us/library/system.net.httpwebrequest.cookiecontainer.aspx

    The code looks something like this:

    // Create an HttpWebRequest
    HttpWebRequest request = (HttpWebRequest)WebRequest.Create(getUrl);
    
    // Create the cookie container
    request.CookieContainer = new CookieContainer();
    
    // Add all the cookies; "response" is the HttpWebResponse from the earlier login request
    foreach (Cookie cookie in response.Cookies)
    {
        request.CookieContainer.Add(cookie);
    }
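
    A self-contained sketch of the same copy step, which can help when checking what a container will actually send. The helper name ToContainer and the example.com cookie are made up for illustration. Note that CookieContainer.Add(Cookie) throws an ArgumentException when the cookie has no Domain:

```csharp
using System;
using System.Net;

public static class CookieCopySketch
{
    // Hypothetical helper: copies every cookie from a collection
    // (for example response.Cookies) into a fresh container.
    public static CookieContainer ToContainer(CookieCollection cookies)
    {
        var container = new CookieContainer();
        foreach (Cookie cookie in cookies)
        {
            // Throws ArgumentException if cookie.Domain is empty.
            container.Add(cookie);
        }
        return container;
    }
}
```

    Calling GetCookieHeader(new Uri(...)) on the returned container shows the Cookie header that would be sent to that address.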
    

    The second thing is that you don't need to download the page again: you already have it from your web response, and you're saving it here:

    HttpWebResponse getResponse = (HttpWebResponse)getRequest.GetResponse();
    using (StreamReader sr = new StreamReader(getResponse.GetResponseStream(), Encoding.GetEncoding("windows-1251")))
    {
        webBrowser1.DocumentText = doc.DocumentNode.OuterHtml;
    }
    

    You should be able to just take the HTML and parse it with the HTML Agility Pack:

    HtmlDocument doc = new HtmlDocument();
    doc.LoadHtml(webBrowser1.DocumentText);
    

    And that should do it... :)
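
    For the WebClient route mentioned at the start of this answer, a common approach is to subclass WebClient and attach a CookieContainer in GetWebRequest. This is only a minimal sketch: the class name CookieAwareWebClient and the Cookies property are illustrative names, not part of the framework.

```csharp
using System;
using System.Net;

// Hypothetical helper class: a WebClient that sends and stores cookies.
public class CookieAwareWebClient : WebClient
{
    // Shared by every request made through this client instance.
    public CookieContainer Cookies { get; } = new CookieContainer();

    protected override WebRequest GetWebRequest(Uri address)
    {
        WebRequest request = base.GetWebRequest(address);

        // Only HTTP(S) requests carry cookies; other schemes are left untouched.
        var httpRequest = request as HttpWebRequest;
        if (httpRequest != null)
        {
            httpRequest.CookieContainer = Cookies;
        }
        return request;
    }
}
```

    Once a login request has gone through such a client, later calls like DownloadString on the same instance reuse the stored cookies, and the returned HTML can be passed to HtmlDocument.LoadHtml.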

  • 2020-12-17 20:34

    Try caching the cookies from the previous response locally and resending them with each web request, as follows:

    private CookieCollection cookieCollection;
    
    ...
    
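    parserObject = new HtmlWeb
    {
        AutoDetectEncoding = true,
        PreRequest = request =>
        {
            if (cookieCollection != null)
            {
                // The request may not have a container yet, so create one if needed.
                if (request.CookieContainer == null)
                    request.CookieContainer = new CookieContainer();
                // Resend every cached cookie with this request.
                foreach (Cookie cookie in cookieCollection)
                    request.CookieContainer.Add(cookie);
            }
            return true;
        },
        // Keep the cookies from this response for the next request.
        PostResponse = (request, response) => { cookieCollection = response.Cookies; }
    };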
    
  • 2020-12-17 20:42

    Check HtmlAgilityPack.HtmlDocument Cookies

    Here is an example of what you're looking for (syntax not 100% tested, I just modified some class I usually use):

    using System.IO;
    using System.Net;
    using HtmlAgilityPack;

    public class MyWebClient
    {
        // The cookies are kept here and shared by every request.
        private CookieContainer _cookies = new CookieContainer();

        // In case you need to clear the cookies
        public void ClearCookies() {
            _cookies = new CookieContainer();
        }

        public HtmlDocument GetPage(string url) {
            HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
            request.Method = "GET";

            // Set more parameters here...
            //...

            // This is the important part.
            request.CookieContainer = _cookies;

            // When the response arrives, any cookies it sets are stored
            // automatically in "_cookies".
            using (var response = (HttpWebResponse)request.GetResponse())
            using (var reader = new StreamReader(response.GetResponseStream())) {
                string html = reader.ReadToEnd();
                var doc = new HtmlDocument();
                doc.LoadHtml(html);
                return doc;
            }
        }
    }
    

    Here is how you use it:

    var client = new MyWebClient();
    HtmlDocument doc = client.GetPage("http://somepage.com");
    
    //This request will be sent with the cookies obtained from the page
    doc = client.GetPage("http://somepage.com/another-page");
    

    Note: If you also want to use the POST method, create a method similar to GetPage with the POST logic, refactor the class, etc.
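
    A rough sketch of what that POST logic could look like, assuming a URL-encoded form body. The names PostSketch, CreatePost and Send are hypothetical, and the helper returns the raw HTML string rather than an HtmlDocument so that it stays free of external dependencies; inside MyWebClient you would pass _cookies as the container and load the result with doc.LoadHtml as in GetPage.

```csharp
using System;
using System.IO;
using System.Net;
using System.Text;

public static class PostSketch
{
    // Hypothetical helper: builds a form POST that shares the given cookie container.
    public static HttpWebRequest CreatePost(string url, CookieContainer cookies)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        request.Method = "POST";
        request.ContentType = "application/x-www-form-urlencoded";
        request.CookieContainer = cookies;
        return request;
    }

    // Hypothetical helper: writes the form body, sends the request,
    // and returns the response body as a string.
    public static string Send(HttpWebRequest request, string formData)
    {
        byte[] body = Encoding.UTF8.GetBytes(formData);
        request.ContentLength = body.Length;

        using (Stream requestStream = request.GetRequestStream())
        {
            requestStream.Write(body, 0, body.Length);
        }

        // Cookies set by the response end up in request.CookieContainer.
        using (var response = (HttpWebResponse)request.GetResponse())
        using (var reader = new StreamReader(response.GetResponseStream()))
        {
            return reader.ReadToEnd();
        }
    }
}
```

    Because the same CookieContainer instance is passed to both the POST and any later GET, a session cookie set by the login response is sent automatically on the following requests.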
