Question
I am trying to save the HTML of a website in a string. The website contains international characters (ę, ś, ć, ...), and they are not saved to the string correctly, even though I set the encoding to UTF-8, which matches the website's charset.
Here is my code:
    using (WebClient client = new WebClient())
    {
        client.Encoding = Encoding.UTF8;
        string htmlCode = client.DownloadString("http://www.filmweb.pl/Mroczne.Widmo");
    }
When I print "htmlCode" to the console, the international characters are not shown correctly even though in the original HTML they are shown correctly.
Any help is appreciated.
Answer 1:
I had the same problem. It seems that client.DownloadString does not decode the response as UTF-8. Calling client.DownloadData and decoding the returned bytes with Encoding.UTF8.GetString solves the problem.
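    using (WebClient client = new WebClient())
    {
        // Download the raw response bytes instead of a pre-decoded string.
        var htmlData = client.DownloadData("http://www.filmweb.pl/Mroczne.Widmo");
        // Decode the bytes explicitly as UTF-8.
        var htmlCode = Encoding.UTF8.GetString(htmlData);
    }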
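To see why the explicit decoding step matters, here is a minimal sketch that runs without any network access. It assumes the server really sends UTF-8 bytes. The class name DecodeDemo, the helper DecodeUtf8, and the sample markup are hypothetical and not part of the original answer. The byte array stands in for what client.DownloadData would return.

```csharp
using System;
using System.Text;

class DecodeDemo
{
    // Hypothetical helper: decodes raw response bytes as UTF-8,
    // mirroring the Encoding.UTF8.GetString call in the answer.
    public static string DecodeUtf8(byte[] data)
    {
        return Encoding.UTF8.GetString(data);
    }

    static void Main()
    {
        // Sample markup containing ę, ś, ć (written as escapes so the
        // source file's own encoding does not affect the demo).
        string original = "<p>\u0119, \u015B, \u0107</p>";

        // Stand-in for the bytes client.DownloadData would return.
        byte[] htmlData = Encoding.UTF8.GetBytes(original);

        // Decoding as UTF-8 recovers the original characters.
        string decoded = DecodeUtf8(htmlData);

        // Decoding the same bytes with a single-byte encoding turns each
        // multi-byte character into two wrong characters.
        string garbled = Encoding.GetEncoding("ISO-8859-1").GetString(htmlData);

        Console.WriteLine(decoded == original);
        Console.WriteLine(garbled == original);
        Console.WriteLine(garbled.Length - decoded.Length);
    }
}
```

Each of the three accented characters takes two bytes in UTF-8, so the wrongly decoded string comes out three characters longer than the correct one. If the string compares equal to the expected text after decoding, the download step is fine and any remaining problem lies in how the string is displayed.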
Source: https://stackoverflow.com/questions/37200465/webclient-downloadstring-utf-8-not-displaying-international-characters