WebClient DownloadString UTF-8 not displaying international characters

Submitted by 杀马特。学长 韩版系。学妹 on 2019-12-07 03:05:38

Question


I am trying to save the HTML of a website in a string. The website contains international characters (ę, ś, ć, ...), and they are not preserved in the string even though I set the encoding to UTF-8, which matches the website's charset.

Here is my code:

using (WebClient client = new WebClient())
{
    client.Encoding = Encoding.UTF8;
    string htmlCode = client.DownloadString("http://www.filmweb.pl/Mroczne.Widmo");
}

When I print htmlCode to the console, the international characters are not displayed correctly, even though they appear correctly in the original HTML.

Any help is appreciated.


Answer 1:


I had the same problem. It seems that client.DownloadString does not always decode the response as UTF-8, even when client.Encoding is set. Using client.DownloadData and decoding the returned bytes with Encoding.UTF8.GetString solves the problem.

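using (WebClient client = new WebClient())
{
    // Download the raw response bytes without any implicit decoding.
    var htmlData = client.DownloadData("http://www.filmweb.pl/Mroczne.Widmo");
    // Decode the bytes explicitly as UTF-8.
    var htmlCode = Encoding.UTF8.GetString(htmlData);
}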
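
To see why the choice of decoder matters, here is a minimal offline sketch that decodes the same bytes in two ways. The class name EncodingDemo, its helper methods, and the sample string are hypothetical and only stand in for bytes that DownloadData would return; ISO-8859-1 is picked purely as an example of a mismatched encoding, not as what the server or WebClient actually uses.

```csharp
using System;
using System.Text;

public static class EncodingDemo
{
    // Hypothetical sample standing in for downloaded content: "ęść",
    // written with escapes so the source file's own encoding is irrelevant.
    public const string Sample = "\u0119\u015B\u0107";

    // Decode the bytes as UTF-8, as the answer above does.
    public static string DecodeUtf8(byte[] data)
    {
        return Encoding.UTF8.GetString(data);
    }

    // Decode the same bytes with a mismatched single-byte encoding.
    public static string DecodeLatin1(byte[] data)
    {
        return Encoding.GetEncoding("ISO-8859-1").GetString(data);
    }

    public static void Main()
    {
        // Each of the three characters takes two bytes in UTF-8.
        byte[] data = Encoding.UTF8.GetBytes(Sample);

        // True when the UTF-8 round trip preserves the text.
        Console.WriteLine(DecodeUtf8(data) == Sample);

        // The mismatched decoder produces one character per byte.
        Console.WriteLine(DecodeLatin1(data).Length);
    }
}
```

If the string decodes correctly but the console still shows the wrong characters, the console's own output encoding (Console.OutputEncoding) may be the cause. That is a separate concern from how the downloaded bytes are decoded.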


Source: https://stackoverflow.com/questions/37200465/webclient-downloadstring-utf-8-not-displaying-international-characters
