Windows Phone 8 SDK WebClient Encoding Issue

Submitted by 六眼飞鱼酱① on 2020-01-03 04:22:18

Question


I'm trying to parse HTML from a site that uses the windows-1254 charset, but all Turkish characters are shown like this: � � � � �

Where is the actual problem? I have tried these:

webClient.Encoding = System.Text.Encoding.UTF8;
webClient.Encoding = System.Text.Encoding.GetEncoding("UTF-8");

I also tried this replacement function:

public string ReplaceText(string _text)
{
    // Maps mojibake (UTF-8 bytes misread as a single-byte code page) back to Turkish letters.
    _text = _text.Replace("Ä°", "İ").Replace("Ä±", "ı").Replace("Ã¼", "ü")
                 .Replace("ÅŸ", "ş").Replace("Å", "Ş").Replace("Ã§", "ç")
                 .Replace("Ã¶", "ö").Replace("ÄŸ", "ğ").Replace("Ã‡", "Ç")
                 .Replace("Ã–", "Ö").Replace("Ãœ", "Ü");
    return _text;
}

I also set these headers:

webClient.Headers["User-Agent"] = "Mozilla/5.0 (compatible; MSIE 10.0; Windows NT 6.2; WOW64; Trident/6.0)";
webClient.Headers["Accept-Charset"] = "windows-1254,utf-8;q=0.7,*;q=0.7";

(I tried iso-8859-9 and utf-8 there as well.)

This is how I am using the WebClient:

WebClient wb = new WebClient();
wb.Headers["User-Agent"] = "Mozilla/5.0 (compatible; MSIE 10.0; Windows NT 6.2; WOW64; Trident/6.0)";
wb.Headers["Accept-Charset"] = "windows-1254,utf-8;q=0.7,*;q=0.7";
wb.DownloadStringAsync(new Uri("http://www.site.com"));
wb.Encoding = System.Text.Encoding.UTF8;
wb.DownloadStringCompleted += new DownloadStringCompletedEventHandler(DSC);

The handler:

HtmlDocument htmlDoc = new HtmlDocument();
htmlDoc.LoadHtml(e.Result);

var inputs = htmlDoc.DocumentNode.SelectNodes("//div[@id=\"mrln-eyhaber\"]//a");

foreach (var input in inputs)
{
    textarea.Text += this.ReplaceText(input.Attributes["title"].Value.ToString()) + "\n\n";
}

Answer 1:


Instead of using the standard approach, why don't you create a custom class, specific to your needs, which will handle the encoding?

This will help you generate the class, like so:

and then all you have to do is:

webClient.Encoding = new CustomEncoding();

Let me know how it goes (:




Answer 2:


Why did you set the Encoding to UTF-8 if you know it's windows-1254? The fix is rather easy: you just have to set the correct encoding on the WebClient.

wb.Encoding = Encoding.GetEncoding(1254);

or

wb.Encoding = Encoding.GetEncoding("windows-1254");

Also, your ReplaceText method shouldn't be needed anymore either.
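
To see why the UTF-8 setting produces the � characters, here is a minimal sketch (the class and method names are made up for illustration). In windows-1254 the letter ş is the single byte 0xFE, which is never valid in UTF-8, so the decoder substitutes U+FFFD. Once that happens the original byte is gone, which is why replacing strings afterwards cannot bring the letter back.

```csharp
using System;
using System.Text;

// Hypothetical helper, not part of the original question's code.
public static class MojibakeDemo
{
    // Decodes raw bytes as UTF-8, the way the question's WebClient was configured.
    public static string DecodeAsUtf8(byte[] raw)
    {
        return Encoding.UTF8.GetString(raw, 0, raw.Length);
    }

    // True if decoding had to fall back to the replacement character.
    public static bool LostInformation(byte[] raw)
    {
        return DecodeAsUtf8(raw).IndexOf('\uFFFD') >= 0;
    }
}
```

For example, the bytes 0x61 0xFE 0x62 ("aşb" in windows-1254) make LostInformation return true, while plain ASCII bytes do not.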

EDIT: Of course, Windows Phone doesn't support that encoding out of the box; you have to implement any encoding other than UTF-8 or UTF-16 yourself. Luckily there's an easy way to do that: you just have to use the program described here to generate your own encoding class.
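
A class of that kind could look roughly like the sketch below. This is not the output of the generator mentioned above: the name Windows1254Encoding is hypothetical, the table is written by hand from the windows-1254 code page layout, and mapping unassigned bytes to U+FFFD and unencodable characters to '?' are my own assumptions.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical hand-written single-byte encoding for windows-1254.
public sealed class Windows1254Encoding : Encoding
{
    private static readonly char[] ByteToChar = BuildTable();
    private static readonly Dictionary<char, byte> CharToByte = BuildReverse();

    private static char[] BuildTable()
    {
        var table = new char[256];
        // Start from Latin-1, where the byte value equals the code point.
        for (int i = 0; i < 256; i++) table[i] = (char)i;

        // 0x80-0x9F: punctuation block; unassigned slots become U+FFFD (assumption).
        const string high =
            "\u20AC\uFFFD\u201A\u0192\u201E\u2026\u2020\u2021" +
            "\u02C6\u2030\u0160\u2039\u0152\uFFFD\uFFFD\uFFFD" +
            "\uFFFD\u2018\u2019\u201C\u201D\u2022\u2013\u2014" +
            "\u02DC\u2122\u0161\u203A\u0153\uFFFD\uFFFD\u0178";
        for (int i = 0; i < high.Length; i++) table[0x80 + i] = high[i];

        // The six positions where windows-1254 differs from Latin-1.
        table[0xD0] = '\u011E'; // Ğ
        table[0xDD] = '\u0130'; // İ
        table[0xDE] = '\u015E'; // Ş
        table[0xF0] = '\u011F'; // ğ
        table[0xFD] = '\u0131'; // ı
        table[0xFE] = '\u015F'; // ş
        return table;
    }

    private static Dictionary<char, byte> BuildReverse()
    {
        var map = new Dictionary<char, byte>();
        for (int i = 0; i < 256; i++)
        {
            if (ByteToChar[i] != '\uFFFD') map[ByteToChar[i]] = (byte)i;
        }
        return map;
    }

    public override int GetByteCount(char[] chars, int index, int count)
    {
        return count; // one byte per char
    }

    public override int GetBytes(char[] chars, int charIndex, int charCount, byte[] bytes, int byteIndex)
    {
        for (int i = 0; i < charCount; i++)
        {
            byte b;
            // Characters outside the code page are written as '?' (assumption).
            bytes[byteIndex + i] = CharToByte.TryGetValue(chars[charIndex + i], out b) ? b : (byte)'?';
        }
        return charCount;
    }

    public override int GetCharCount(byte[] bytes, int index, int count)
    {
        return count; // one char per byte
    }

    public override int GetChars(byte[] bytes, int byteIndex, int byteCount, char[] chars, int charIndex)
    {
        for (int i = 0; i < byteCount; i++)
        {
            chars[charIndex + i] = ByteToChar[bytes[byteIndex + i]];
        }
        return byteCount;
    }

    public override int GetMaxByteCount(int charCount) { return charCount; }

    public override int GetMaxCharCount(int byteCount) { return byteCount; }
}
```

With a class like this you would assign wb.Encoding = new Windows1254Encoding(); and it is safest to do that, and to attach the DownloadStringCompleted handler, before calling DownloadStringAsync.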



Source: https://stackoverflow.com/questions/19073787/windows-phone-8-sdk-webclient-encoding-issue
