Read non-English characters from an HTTP GET request

孤街浪徒 submitted on 2019-12-19 04:21:50

Question


I have a problem reading Hebrew characters from an HTTP GET request.

I'm getting square characters like "[]" instead of the Hebrew characters.

The English characters are fine.

This is my function:

public String executeHttpGet(String urlString) throws Exception {
    BufferedReader in = null;
    try {
        HttpClient client = new DefaultHttpClient();
        HttpGet request = new HttpGet();
        request.setURI(new URI(urlString));
        HttpResponse response = client.execute(request);
        in = new BufferedReader(new InputStreamReader(response.getEntity().getContent(),"UTF-8"));
        StringBuffer sb = new StringBuffer("");
        String line = "";
        String NL = System.getProperty("line.separator");
        while ((line = in.readLine()) != null) {
            sb.append(line + NL);
        }
        in.close();
        String page = sb.toString();
        // System.out.println(page);
        return page;
    } finally {
        if (in != null) {
            try {
                in.close();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }
}

You can test it with this example URL:

String str = executeHttpGet("http://kavim-t.co.il/include/getXMLStations.asp?parent=7_%20_1");

Thank you!


Answer 1:


The file you linked to doesn't seem to be encoded as UTF-8. I checked that it opens correctly as WINDOWS-1255 (a Hebrew encoding), so you should pass that to your InputStreamReader instead of "UTF-8".
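To see why the charset name matters, here is a minimal sketch that needs no network access. The class name HebrewDecodingDemo and the sample word are made up for illustration, and it assumes your JRE ships the windows-1255 charset (full JDKs normally do, stripped-down runtimes may not). It encodes a Hebrew word as windows-1255, then decodes the same bytes with both charsets:

```java
import java.nio.charset.Charset;

class HebrewDecodingDemo {
    // Decodes raw bytes using the charset with the given name.
    static String decode(byte[] raw, String charsetName) {
        return new String(raw, Charset.forName(charsetName));
    }

    public static void main(String[] args) {
        // The Hebrew word "shalom", written with Unicode escapes so the
        // source file's own encoding cannot interfere.
        String original = "\u05E9\u05DC\u05D5\u05DD";

        // Simulate what a server sending windows-1255 puts on the wire.
        byte[] raw = original.getBytes(Charset.forName("windows-1255"));

        String wrong = decode(raw, "UTF-8");        // mismatched charset
        String right = decode(raw, "windows-1255"); // matching charset

        System.out.println("UTF-8 round trip ok: " + wrong.equals(original));
        System.out.println("windows-1255 round trip ok: " + right.equals(original));
    }
}
```

In the function from the question, the equivalent change would be replacing "UTF-8" with "windows-1255" in the InputStreamReader constructor.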




Answer 2:


Try a different website; this one doesn't appear to use UTF-8. Alternatively, UTF-16 may work, but I haven't tried it. Your code looks fine.




Answer 3:


As others have pointed out, the content is not actually encoded as UTF-8. You might want to look at httpEntity.getContentType() to extract the actual encoding of the content, and then pass that to your InputStreamReader. Your code will then cope correctly with any encoding the server declares.
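A rough sketch of that idea, using only the standard library: the helper below pulls the charset parameter out of a Content-Type header value and falls back to a default when none is declared or the name is not supported. The class and method names (ContentTypeCharset, charsetFrom) are hypothetical. With HttpClient 4.x the header string would presumably come from response.getEntity().getContentType().getValue(), after a null check on the header.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

class ContentTypeCharset {
    // Returns the charset named in a Content-Type value such as
    // "text/xml; charset=windows-1255", or the fallback if the header is
    // null, has no charset parameter, or names an unsupported charset.
    static Charset charsetFrom(String contentType, Charset fallback) {
        if (contentType == null) {
            return fallback;
        }
        for (String part : contentType.split(";")) {
            String p = part.trim();
            if (p.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                // Strip optional quotes around the charset name.
                String name = p.substring("charset=".length()).replace("\"", "").trim();
                try {
                    return Charset.forName(name);
                } catch (IllegalArgumentException e) {
                    // Covers both illegal and unsupported charset names.
                    return fallback;
                }
            }
        }
        return fallback;
    }

    public static void main(String[] args) {
        Charset cs = charsetFrom("text/html; charset=UTF-8", StandardCharsets.ISO_8859_1);
        System.out.println(cs.name());
    }
}
```

The returned Charset can be passed straight to new InputStreamReader(stream, charset). Note that this only helps when the server declares its encoding; if the header carries no charset parameter, you still have to pick a sensible fallback yourself.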




Answer 4:


Hi. As posted in this other question, Special characters in PHP / MySQL, you can set the character set in the PHP file.

The example there sets UTF-8, but you can set a different character set that supports the characters you need.



Source: https://stackoverflow.com/questions/9430982/read-non-english-characters-from-http-get-request
