How can I get a web page's content and save it into a string variable?

走了就别回头了 2020-11-28 03:59

How can I get the content of a web page using ASP.NET? I need to write a program that fetches the HTML of a web page and stores it in a string variable.

4 Answers
  • 2020-11-28 04:07
    WebClient client = new WebClient();
    string content = client.DownloadString(url);
    

    Pass the URL of the page you want to fetch. You can then parse the result using Html Agility Pack.
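
    As a rough illustration of the parsing step, here is a minimal sketch that pulls the page title out of an HTML string. It assumes the third-party HtmlAgilityPack NuGet package is installed; TitleExtractor and GetTitle are hypothetical names, and the literal markup stands in for the string returned by DownloadString.

    ```csharp
    using System;
    using HtmlAgilityPack; // third-party NuGet package, assumed to be installed

    public static class TitleExtractor
    {
        // Hypothetical helper: returns the trimmed text of the first <title>
        // element, or null when the document has none.
        public static string GetTitle(string html)
        {
            var doc = new HtmlDocument();
            doc.LoadHtml(html);

            // SelectSingleNode returns null when the XPath matches nothing.
            HtmlNode node = doc.DocumentNode.SelectSingleNode("//title");
            return node == null ? null : node.InnerText.Trim();
        }

        public static void Main()
        {
            // Placeholder markup; in practice this would be the downloaded content.
            string content = "<html><head><title> Example page </title></head><body></body></html>";
            Console.WriteLine(GetTitle(content));
        }
    }
    ```

    The null check matters because a page without a matching element yields null rather than an empty node.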

  • 2020-11-28 04:21

    I recommend not using WebClient.DownloadString. At least in .NET 3.5, DownloadString is not smart enough to detect and strip the BOM when one is present. As a result, the BOM (ï»¿) can incorrectly appear at the start of the string when UTF-8 data is returned (at least when the response declares no charset). Ick!

    Instead, this slight variation will work correctly with BOMs:

    string ReadTextFromUrl(string url) {
        // WebClient is still convenient
        // Assume UTF8, but detect BOM - could also honor response charset I suppose
        using (var client = new WebClient())
        using (var stream = client.OpenRead(url))
        using (var textReader = new StreamReader(stream, Encoding.UTF8, true)) {
            return textReader.ReadToEnd();
        }
    }
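
    To see the difference without touching the network, here is a small sketch that decodes the same bytes two ways. BomDemo and DecodeWithBomDetection are hypothetical names, and Encoding.UTF8.GetString merely stands in for any decoder that does not strip the BOM; it is not what DownloadString does internally.

    ```csharp
    using System;
    using System.IO;
    using System.Text;

    public static class BomDemo
    {
        // Decodes the bytes the same way the helper above does: assume UTF-8,
        // but let StreamReader detect and consume a byte order mark if present.
        public static string DecodeWithBomDetection(byte[] bytes)
        {
            using (var stream = new MemoryStream(bytes))
            using (var reader = new StreamReader(stream, Encoding.UTF8, true))
            {
                return reader.ReadToEnd();
            }
        }

        public static void Main()
        {
            // UTF-8 BOM (EF BB BF) followed by the ASCII text "hi".
            byte[] payload = { 0xEF, 0xBB, 0xBF, 0x68, 0x69 };

            string raw = Encoding.UTF8.GetString(payload);   // BOM survives as U+FEFF
            string clean = DecodeWithBomDetection(payload);  // BOM is consumed

            Console.WriteLine(raw.Length);   // 3
            Console.WriteLine(clean.Length); // 2
        }
    }
    ```

    The extra character in the first result is the BOM itself (U+FEFF), which is exactly the stray prefix described above.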
    
  • 2020-11-28 04:23

    I've run into issues with WebClient.DownloadString before. If you do too, you can try this:

    WebRequest request = WebRequest.Create("http://www.google.com");
    string html;
    // Dispose the response and its stream as well as the reader.
    using (WebResponse response = request.GetResponse())
    using (Stream data = response.GetResponseStream())
    using (StreamReader sr = new StreamReader(data))
    {
        html = sr.ReadToEnd();
    }
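
    Note that new StreamReader(stream) decodes as UTF-8 unless a BOM says otherwise. If the server declares a different charset in its Content-Type header, one possible way to honor it is sketched below. CharsetAwareFetch, EncodingFromContentType and ReadHtml are hypothetical names, and falling back to UTF-8 is an assumption, not something the framework prescribes.

    ```csharp
    using System;
    using System.IO;
    using System.Net;
    using System.Text;

    public static class CharsetAwareFetch
    {
        // Hypothetical helper: picks an Encoding from a Content-Type header value
        // such as "text/html; charset=iso-8859-1". Falls back to UTF-8 (an
        // assumption) when the header is missing, malformed, or names an
        // encoding this runtime does not know.
        public static Encoding EncodingFromContentType(string header)
        {
            if (string.IsNullOrWhiteSpace(header)) return Encoding.UTF8;
            try
            {
                string charset = new System.Net.Mime.ContentType(header).CharSet;
                return string.IsNullOrEmpty(charset)
                    ? Encoding.UTF8
                    : Encoding.GetEncoding(charset);
            }
            catch (FormatException) { return Encoding.UTF8; }   // malformed header
            catch (ArgumentException) { return Encoding.UTF8; } // unknown charset name
        }

        // Same flow as the snippet above, but the reader uses the declared charset.
        public static string ReadHtml(string url)
        {
            WebRequest request = WebRequest.Create(url);
            using (WebResponse response = request.GetResponse())
            using (Stream data = response.GetResponseStream())
            using (var reader = new StreamReader(
                data, EncodingFromContentType(response.ContentType), true))
            {
                return reader.ReadToEnd();
            }
        }
    }
    ```

    Calling CharsetAwareFetch.ReadHtml("http://www.google.com") would then return the page as a string. BOM detection stays enabled, so a BOM in the body still takes precedence over the header.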
    
  • 2020-11-28 04:25

    You can use WebClient:

    WebClient client = new WebClient();
    string downloadString = client.DownloadString("http://www.google.com");
    