Converting a String from Windows charset to UTF-8 in Java

Submitted by こ雲淡風輕ζ on 2019-12-11 09:06:46

Question


I have to pass some arguments to my Java app, which is launched from a .bat file. The arguments arrive in the system's charset encoding, so some characters are displayed incorrectly. I tried this:

     String titulo;

     titulo = new String (args[1].getBytes(),"Cp1252");

I also tried a few others from this list: http://docs.oracle.com/javase/1.4.2/docs/guide/intl/encoding.doc.html and none of them worked. How else can I convert a String from the Windows charset to Java's UTF-8? Thanks a lot in advance!

Regards, Rodrigo.

EDIT: The argument I pass in the .bat file is Martín, and the output (displayed in a JLabel) shows MartÝn.


Answer 1:


The Windows command prompt (cmd.exe) doesn't actually use CP1252. Which code page it uses depends on the system; on Western European systems it is most likely CP850. So you can try this:

     titulo = new String (args[1].getBytes(),"Cp850");
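For reference, here is the same re-decoding with both charsets spelled out, as a minimal sketch. It assumes the JVM decoded the argument as windows-1252 while the console actually supplied CP850 bytes; the question confirms neither. The names `ArgRedecode` and `fromConsole` are made up for illustration. A bare `getBytes()` uses the JVM's default charset, which is not necessarily Cp1252 (since Java 18 it defaults to UTF-8), so naming the charset explicitly keeps the round trip predictable.

```java
import java.nio.charset.Charset;

public class ArgRedecode {
    // Assumption: the JVM decoded the raw argument bytes as windows-1252.
    static final Charset ASSUMED_JVM = Charset.forName("windows-1252");
    // Assumption: the console really produced the bytes in CP850.
    static final Charset ASSUMED_CONSOLE = Charset.forName("IBM850");

    // Recover the original bytes, then decode them with the console's charset.
    static String fromConsole(String arg) {
        return new String(arg.getBytes(ASSUMED_JVM), ASSUMED_CONSOLE);
    }

    public static void main(String[] args) {
        if (args.length < 2) {
            System.err.println("usage: java ArgRedecode <arg0> <titulo>");
            return;
        }
        // Same position as in the question: the title is the second argument.
        String titulo = fromConsole(args[1]);
        System.out.println(titulo);
    }
}
```

The round trip only works for bytes that windows-1252 can map in both directions, so it is worth testing with the actual characters you expect.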

You can look at the code tables for cp850 to check what should happen: í is the byte ED in Latin-1 (and, by extension, cp1252), and the byte ED in cp850 is Ý. Therefore, if you print "í" from a Java GUI to cmd.exe, it will show up as "Ý".

(But you seem to be seeing the reverse: "í" from the terminal shows up as "Ý" in a GUI. That doesn't make sense: cmd.exe should pass the byte A1 to Java, which should interpret it as "¡".)
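Both table lookups can be checked directly in Java, as a small sketch. It assumes the runtime ships the IBM850 charset (full JDKs normally do; a trimmed-down runtime may not). `CodePageCheck`, `reinterpret` and `firstByteHex` are illustrative names, and the characters are written as Unicode escapes so the source file's own encoding does not interfere.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CodePageCheck {
    // Encode text with one charset, then decode the same bytes with another.
    static String reinterpret(String text, Charset encodeWith, Charset decodeWith) {
        return new String(text.getBytes(encodeWith), decodeWith);
    }

    // Hex value of the first byte produced when encoding text with the charset.
    static String firstByteHex(String text, Charset charset) {
        return String.format("%02X", text.getBytes(charset)[0] & 0xFF);
    }

    public static void main(String[] args) {
        Charset latin1 = StandardCharsets.ISO_8859_1;
        // Throws UnsupportedCharsetException if the runtime lacks this charset.
        Charset cp850 = Charset.forName("IBM850");
        String iAcute = "\u00ED"; // i with acute accent

        // GUI-to-console direction: the Latin-1 byte read back as cp850.
        System.out.println(firstByteHex(iAcute, latin1));
        System.out.printf("U+%04X%n", (int) reinterpret(iAcute, latin1, cp850).charAt(0));

        // Console-to-Java direction: the cp850 byte read back as Latin-1.
        System.out.println(firstByteHex(iAcute, cp850));
        System.out.printf("U+%04X%n", (int) reinterpret(iAcute, cp850, latin1).charAt(0));
    }
}
```

Comparing the printed code points with the symptom in the question (U+00DD, Ý) shows which direction of mis-decoding actually took place.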



Source: https://stackoverflow.com/questions/9436188/converting-string-from-windows-charset-to-utf-8-in-java
