How do I scrape information off ASP.NET websites when paging and JavaScript links are being used?

轻奢々 2021-01-12 19:27

I have been given a staff list which is supposed to be up to date, but it doesn't match an intranet People Finder which is written in ASP.NET.

As the information is

2 Answers
  •  走了就别回头了
    2021-01-12 20:06

    You could POST a form variable to the page to step through the paging.

    // Requires: using System.IO; using System.Net; using System.Text; using System.Web;
    string lcUrl = "http://www.mysite.com/page.aspx";

    HttpWebRequest loHttp = (HttpWebRequest) WebRequest.Create(lcUrl);

    // *** Send any POST data
    string lcPostData = "gvEmployees=" + HttpUtility.UrlEncode("Page$2");

    loHttp.Method = "POST";
    // Without this content type the server will not parse the body as form fields
    loHttp.ContentType = "application/x-www-form-urlencoded";

    byte[] lbPostBuffer = Encoding.GetEncoding(1252).GetBytes(lcPostData);
    loHttp.ContentLength = lbPostBuffer.Length;

    Stream loPostData = loHttp.GetRequestStream();
    loPostData.Write(lbPostBuffer, 0, lbPostBuffer.Length);
    loPostData.Close();

    // *** Read the response into a string
    HttpWebResponse loWebResponse = (HttpWebResponse) loHttp.GetResponse();
    Encoding enc = Encoding.GetEncoding(1252);
    StreamReader loResponseStream =
        new StreamReader(loWebResponse.GetResponseStream(), enc);
    string lcHtml = loResponseStream.ReadToEnd();

    // Close the reader first, then the response it was reading from
    loResponseStream.Close();
    loWebResponse.Close();
    

    Then parse out the data you need from the string.
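
    As a minimal sketch of that parsing step: assuming the People Finder renders its results as an ordinary HTML table (which is what a GridView normally produces), regular expressions can pull the cell text out of each row. `GridParser` and `ExtractRows` are names made up for this example. The regex approach will not cope with nested tables such as a pager row, so a real HTML parser like Html Agility Pack would be more robust.

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.Net;
    using System.Text.RegularExpressions;

    // Hypothetical helper, not part of the original answer.
    public static class GridParser
    {
        // Returns one string array per <tr>, holding the decoded text of each <td>.
        // Rows without any <td> cells (for example header rows made of <th>) are skipped.
        public static List<string[]> ExtractRows(string html)
        {
            const RegexOptions opts = RegexOptions.IgnoreCase | RegexOptions.Singleline;
            var rows = new List<string[]>();

            foreach (Match row in Regex.Matches(html, @"<tr\b[^>]*>(.*?)</tr>", opts))
            {
                var cells = new List<string>();
                foreach (Match cell in Regex.Matches(row.Groups[1].Value,
                                                     @"<td\b[^>]*>(.*?)</td>", opts))
                {
                    // Drop any tags inside the cell, then decode entities such as &amp;
                    string text = Regex.Replace(cell.Groups[1].Value, "<[^>]+>", "");
                    cells.Add(WebUtility.HtmlDecode(text).Trim());
                }
                if (cells.Count > 0)
                {
                    rows.Add(cells.ToArray());
                }
            }
            return rows;
        }
    }
    ```

    Calling `GridParser.ExtractRows(lcHtml)` on the response string then gives one array per data row, which can be compared against the staff list.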

    --EDIT--

    Here is what I would try (something similar) where all of the post data is sent:

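    // __VIEWSTATE is left empty here; in practice it usually has to be the
    // value taken from the page that was fetched just before this request.
    string lcPostData =
        "__EVENTTARGET=" + HttpUtility.UrlEncode("gvEmployees") +
        "&__EVENTARGUMENT=" + HttpUtility.UrlEncode("Page$2") +
        "&__VIEWSTATE=" + HttpUtility.UrlEncode("");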
    
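
    An empty `__VIEWSTATE` will often be rejected, so a real postback usually has to echo back the hidden fields from the page it came from. The sketch below shows one way to do that, under several assumptions: the hidden inputs are rendered with `name` before `value` (the usual ASP.NET output), the grid's ID is `gvEmployees` as above (with master pages the real unique ID may be longer, such as `ctl00$Main$gvEmployees`), and `WebUtility.UrlEncode` is available (.NET 4.5 or later). `PostbackHelper`, `GetHiddenField`, and `BuildPagingPostData` are names invented for this example.

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.Net;
    using System.Text.RegularExpressions;

    // Hypothetical helper, not part of the original answer.
    public static class PostbackHelper
    {
        // Reads the value of a hidden <input> such as __VIEWSTATE.
        // Returns an empty string when the field is not on the page.
        public static string GetHiddenField(string html, string name)
        {
            Match m = Regex.Match(
                html,
                "<input\\b[^>]*\\bname=\"" + Regex.Escape(name) + "\"[^>]*\\bvalue=\"([^\"]*)\"",
                RegexOptions.IgnoreCase);
            return m.Success ? WebUtility.HtmlDecode(m.Groups[1].Value) : "";
        }

        // Builds the form body that mimics clicking a GridView pager link.
        // html is the previously fetched page; page is the 1-based page number.
        public static string BuildPagingPostData(string html, string gridUniqueId, int page)
        {
            var parts = new List<string>
            {
                "__EVENTTARGET=" + WebUtility.UrlEncode(gridUniqueId),
                "__EVENTARGUMENT=" + WebUtility.UrlEncode("Page$" + page),
                "__VIEWSTATE=" + WebUtility.UrlEncode(GetHiddenField(html, "__VIEWSTATE"))
            };

            // Only sent when the page actually contains it.
            string validation = GetHiddenField(html, "__EVENTVALIDATION");
            if (validation.Length > 0)
            {
                parts.Add("__EVENTVALIDATION=" + WebUtility.UrlEncode(validation));
            }
            return string.Join("&", parts);
        }
    }
    ```

    The idea would be to GET the page once, pass its HTML to `BuildPagingPostData(html, "gvEmployees", 2)`, POST the result, and then repeat with the HTML of each response to reach the next page.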
