Extended ASCII doesn't work in console!

Posted by 荒凉一梦 on 2019-12-05 21:50:49
Josh Lee

See this question. When Java's default character encoding is not UTF-8 (as seems to be the case on Windows and OS X, but not on Linux), characters that cannot be encoded are converted to question marks. You can pass the appropriate switch to the JVM's command line (-Dfile.encoding=UTF-8 works on some terminals, but I don't have a Windows box in front of me), or you can set an environment variable. Determining the right value portably might be impossible, but if you know that you will always run on the Win32 console, for example, you can choose a Charset and explicitly encode the characters before writing them to standard output, or you can write the bytes you need directly.
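The "choose a Charset and write the bytes yourself" approach could look roughly like the sketch below. The class and method names are made up for illustration, and the charset name passed on the command line (for example IBM437 for a US OEM console) is an assumption you would have to verify for your own console.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of any library.
class ConsoleEncoding {
    // Encode text with an explicitly chosen charset instead of the JVM default.
    // Characters the charset cannot represent become its replacement byte ('?').
    static byte[] encode(String text, Charset charset) {
        return text.getBytes(charset);
    }

    // Write the already-encoded bytes straight to the given stream.
    static void writeEncoded(OutputStream out, String text, Charset charset) throws IOException {
        out.write(encode(text, charset));
        out.flush();
    }

    public static void main(String[] args) throws IOException {
        // Assumption: args[0], if present, names the console's charset (e.g. "IBM437").
        Charset charset = args.length > 0 ? Charset.forName(args[0]) : StandardCharsets.UTF_8;
        writeEncoded(System.out, "\u255A\n", charset);
    }
}
```

Encoding U+255A with a charset that lacks the character (ISO-8859-1, say) yields a single question mark byte, which is the same symptom described above.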

The Windows command prompt uses old DOS OEM encodings by default. System.out uses the default system encoding, which will be a Windows "ANSI" encoding. However, System.console() detects the encoding of the console.
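As a minimal sketch of that difference, the snippet below prefers the writer obtained from System.console() and falls back to System.out when no console is attached (for example when output is redirected). The class name and the fallback policy are illustrative assumptions, not something the original answer prescribes.

```java
import java.io.Console;
import java.io.PrintWriter;

// Hypothetical helper class for illustration.
class ConsoleOutput {
    // Prefer the Console writer; fall back to System.out when there is no console.
    static PrintWriter consoleWriter() {
        Console console = System.console();
        if (console != null) {
            return console.writer();
        }
        // Fallback: uses whatever encoding System.out would have used anyway.
        return new PrintWriter(System.out, true);
    }

    public static void main(String[] args) {
        PrintWriter out = consoleWriter();
        out.println("\u255A");
        out.flush();
    }
}
```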

U+255A (╚) is more likely to be supported by the OEM code pages than by the "ANSI" ones, because the Windows "ANSI" encodings use those byte ranges for accented characters instead.

You can read more here, here, here and here.

Personally, I would avoid the -Dfile.encoding option with codepage 65001 as this produces unintended side-effects in both the console (batch files stop working) and Java (bugs).

If you are using Windows, the console is not UTF-8 but UTF-16, the same native encoding that Java uses, so you should be able to print wide-character strings directly.

I'm not a Java programmer, but in C you have to call _setmode() with the special mode _O_U16TEXT before printing UTF-16 will actually work.

If you want to print multibyte character strings instead, you can set the Windows console to UTF-8, either from the command line with chcp 65001 or programmatically with the Win32 API function SetConsoleOutputCP(). Beware of a bug, though: WriteFile() returns the number of characters written instead of the number of bytes written, as documented. This bug corrupts UTF-8 output on the Windows console from Perl, PHP and Ruby. I believe even MSVCRT falls victim.
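On the Java side, emitting UTF-8 bytes regardless of the JVM default could be sketched as follows. This assumes the console has already been switched to code page 65001 (for example with chcp 65001); the class and method names are hypothetical, and the caveats about that code page mentioned elsewhere in this thread still apply.

```java
import java.io.OutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;

// Hypothetical helper class for illustration.
class Utf8Console {
    // Wrap a stream so that everything printed through it is encoded as UTF-8.
    static PrintStream utf8Stream(OutputStream target) {
        try {
            return new PrintStream(target, true, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // Every Java platform is required to support UTF-8, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Assumption: the console was set to UTF-8 beforehand (chcp 65001).
        PrintStream out = utf8Stream(System.out);
        out.println("\u255A");
    }
}
```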

Good luck!
