Question
I have a problem reading Hebrew characters from an HTTP GET request.
I get square characters like this: "[]" instead of the Hebrew characters.
The English characters are OK.
This is my function:
public String executeHttpGet(String urlString) throws Exception {
    BufferedReader in = null;
    try {
        HttpClient client = new DefaultHttpClient();
        HttpGet request = new HttpGet();
        request.setURI(new URI(urlString));
        HttpResponse response = client.execute(request);
        in = new BufferedReader(new InputStreamReader(response.getEntity().getContent(), "UTF-8"));
        StringBuffer sb = new StringBuffer("");
        String line = "";
        String NL = System.getProperty("line.separator");
        while ((line = in.readLine()) != null) {
            sb.append(line + NL);
        }
        in.close();
        String page = sb.toString();
        // System.out.println(page);
        return page;
    } finally {
        if (in != null) {
            try {
                in.close();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }
}
You can test it with this example URL:
String str = executeHttpGet("http://kavim-t.co.il/include/getXMLStations.asp?parent=7_%20_1");
Thank you!
Answer 1:
The file you linked to doesn't seem to be UTF-8. I tested that it opens correctly using WINDOWS-1255 (the Hebrew encoding), so you should try that instead of UTF-8.
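To illustrate why the charset passed to InputStreamReader matters, here is a minimal sketch that uses only the standard library. It simulates a response body by encoding a Hebrew word as windows-1255 bytes and then decodes those bytes twice, once with the matching charset and once with UTF-8. The class name Windows1255Demo and the helper readAll are made up for this example, and it assumes the JRE ships the windows-1255 charset (full desktop JDKs normally do). In the question's function, the equivalent change would be passing "windows-1255" instead of "UTF-8" to InputStreamReader.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class Windows1255Demo {

    // Hypothetical helper: reads the whole stream as text,
    // decoding the bytes with the given charset.
    public static String readAll(InputStream in, Charset charset) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(in, charset))) {
            char[] buf = new char[1024];
            int n;
            while ((n = reader.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // Assumption: this runtime supports windows-1255.
        Charset hebrew = Charset.forName("windows-1255");

        // Four Hebrew letters, written as escapes so the source file encoding does not matter.
        String original = "\u05E9\u05DC\u05D5\u05DD";

        // Simulated response body: one byte per letter in windows-1255.
        byte[] body = original.getBytes(hebrew);

        String decodedAsHebrew = readAll(new ByteArrayInputStream(body), hebrew);
        String decodedAsUtf8 = readAll(new ByteArrayInputStream(body), StandardCharsets.UTF_8);

        System.out.println("windows-1255 round trip matches: " + decodedAsHebrew.equals(original));
        System.out.println("UTF-8 decoding matches: " + decodedAsUtf8.equals(original));
    }
}
```

Decoding with the wrong charset does not throw here; the reader silently substitutes replacement characters, which is consistent with the squares described in the question.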
Answer 2:
Try a different website; it looks like this one doesn't use UTF-8. Alternatively, UTF-16 may work, but I haven't tried it. Your code looks fine.
Answer 3:
As others have pointed out, the content is not actually encoded as UTF-8. You might want to look at httpEntity.getContentType() to extract the actual encoding of the content, and then pass this to your InputStreamReader. This means your code will then be able to cope correctly with any encoding the server declares.
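As a rough sketch of that idea, the helper below pulls the charset parameter out of a Content-Type header value such as "text/xml; charset=windows-1255" and falls back to a default when the server declares none. The class name ContentTypeCharset and the method charsetOf are hypothetical and use only the standard library; with Apache HttpClient 4.x the header string would typically come from getContentType().getValue() on the entity, which can be null when the header is missing. Newer HttpClient versions may offer their own parsing utilities, so treat this as an illustration rather than the library's recommended route.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

public class ContentTypeCharset {

    // Hypothetical helper: returns the charset named in a Content-Type header
    // value, or the fallback if none is declared or the name is not usable.
    public static Charset charsetOf(String contentType, Charset fallback) {
        if (contentType == null) {
            return fallback;
        }
        String[] parts = contentType.split(";");
        // parts[0] is the media type; parameters start at index 1.
        for (int i = 1; i < parts.length; i++) {
            String param = parts[i].trim();
            if (param.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                String name = param.substring("charset=".length()).trim();
                // Strip optional surrounding double quotes.
                if (name.length() >= 2 && name.startsWith("\"") && name.endsWith("\"")) {
                    name = name.substring(1, name.length() - 1);
                }
                try {
                    return Charset.forName(name);
                } catch (IllegalArgumentException e) {
                    // Covers both illegal and unsupported charset names.
                    return fallback;
                }
            }
        }
        return fallback;
    }

    public static void main(String[] args) {
        Charset fallback = StandardCharsets.ISO_8859_1;
        System.out.println(charsetOf("text/html; charset=UTF-8", fallback));
        System.out.println(charsetOf("text/xml", fallback));
    }
}
```

The returned Charset can be passed straight to the InputStreamReader constructor in place of the hard-coded "UTF-8". Note that this only helps when the server declares the encoding correctly; if the header is missing or wrong, the fallback still has to be chosen by hand.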
Answer 4:
As described in another question, "Special characters in PHP / MySQL", you can set the character set in the PHP file. The example there uses UTF-8, but you can set a different encoding that supports the characters you need.
Source: https://stackoverflow.com/questions/9430982/read-non-english-characters-from-http-get-request