WebClient problem with URL which ends with a period

无人及你 2021-01-21 13:48

I'm running the following code:

using (WebClient wc = new WebClient())
{
    string page = wc.DownloadString(URL);
    ...
}

To access the URL

5 answers
  •  傲寒 (original poster)
    2021-01-21 14:24

    It seems you have found a bug in WebClient/WebRequest, though perhaps Microsoft did that intentionally; who knows. Either way, when you pass in TW., the Uri class translates it to TW, without the period. Because WebClient and WebRequest parse URL strings into Uri objects, your trailing . disappears before the request is ever sent.
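    A quick way to check whether your runtime is affected is to round-trip the URL through Uri and compare the result with the original string. This is only a sketch: the class and method names are made up for illustration, and the URL is pieced together from the host and path used in the code further down. Whether the period survives depends on the .NET version you run it on, so no particular output is claimed here.

```csharp
using System;

static class UriDotCheck
{
    // Hypothetical helper: true if parsing `url` into a Uri
    // gives back exactly the same string.
    public static bool SurvivesUriParsing(string url)
    {
        return new Uri(url).AbsoluteUri == url;
    }

    static void Main()
    {
        // Assumed URL, built from the host and path in the answer's code.
        string url = "http://www.shareprice.co.uk/TW.";

        // Constructing a Uri does not touch the network.
        Console.WriteLine(new Uri(url).AbsoluteUri);
        Console.WriteLine(SurvivesUriParsing(url)
            ? "Trailing period kept"
            : "Trailing period stripped");
    }
}
```

    If the period is stripped, WebClient will request the wrong path, and a workaround like the one below is needed.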

    You may have to get around this by using TcpClient and rolling your own web client. Any variation of this should work:

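    // Requires: using System.IO; using System.Net.Sockets; using System.Text;
    StringBuilder sb = new StringBuilder();

    using (TcpClient oClient = new TcpClient("www.shareprice.co.uk", 80))
    using (NetworkStream ns = oClient.GetStream())
    {
        // Write the request by hand so the trailing period in "TW." is sent as-is.
        // "Connection: close" asks the server to close the socket when it is done,
        // which is what ends the read loop below.
        StreamWriter sw = new StreamWriter(ns);
        sw.Write(
            string.Format(
                "GET /{0} HTTP/1.1\r\nUser-Agent: {1}\r\nHost: www.shareprice.co.uk\r\nConnection: close\r\n\r\n",
                "TW.",
                "MyTCPClient"));
        sw.Flush();

        while (true)
        {
            int i = ns.ReadByte();   // Inefficient but more reliable
            if (i == -1) break;      // Other side has closed the socket
            sb.Append((char)i);      // Append the byte as a char to save the page data
        }
    }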
    

    This will give you a 302 redirect, so just parse out the Location: header and run the above again with the new location. The response looks like this:

    HTTP/1.1 302 Found
    Date: Wed, 11 Nov 2009 19:29:27 GMT
    Server: lighttpd
    X-Powered-By: PHP/5.2.4-2ubuntu5.7
    Expires: Thu, 19 Nov 1981 08:52:00 GMT
    Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
    Pragma: no-cache
    Location: /TW./TAYLOR-WIMPEY-PLC
    Content-type: text/html; charset=UTF-8
    Content-Length: 0
    Set-Cookie: SSID=668d5d0023e9885e1ef3762ef5e44033; path=/
    Vary: Accept-Encoding
    Connection: close
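
    Parsing out the Location header, as described above, might look something like this. It is a minimal sketch, not a full HTTP parser: RedirectHelper and GetLocation are hypothetical names, and it assumes the raw response has already been collected into a string (for example sb.ToString() from the read loop).

```csharp
using System;
using System.IO;

static class RedirectHelper
{
    // Hypothetical helper: returns the value of the Location header
    // from a raw HTTP response, or null if there is none.
    public static string GetLocation(string rawResponse)
    {
        using (var reader = new StringReader(rawResponse))
        {
            reader.ReadLine(); // Skip the status line, e.g. "HTTP/1.1 302 Found"

            string line;
            // Headers end at the first empty line (or at end of input).
            while (!string.IsNullOrEmpty(line = reader.ReadLine()))
            {
                int colon = line.IndexOf(':');
                if (colon <= 0) continue; // Not a "Name: value" line

                string name = line.Substring(0, colon).Trim();
                // Header names are case-insensitive.
                if (name.Equals("Location", StringComparison.OrdinalIgnoreCase))
                    return line.Substring(colon + 1).Trim();
            }
        }
        return null;
    }

    static void Main()
    {
        // Shortened version of the response shown above.
        string raw =
            "HTTP/1.1 302 Found\r\n" +
            "Server: lighttpd\r\n" +
            "Location: /TW./TAYLOR-WIMPEY-PLC\r\n" +
            "Content-Length: 0\r\n" +
            "\r\n";

        Console.WriteLine(GetLocation(raw)); // prints "/TW./TAYLOR-WIMPEY-PLC"
    }
}
```

    The returned path still contains the period, so it has to go into another hand-written GET request rather than back through WebClient.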
    
