Compiling (javac) a UTF-8 encoded Java source file with a BOM

廉价感情. Submitted on 2019-12-02 23:06:19

Trim the BOM and then use javac -encoding utf8 x.java
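For readers who want to script the trimming step, here is a minimal sketch in Java. The class name `BomStripper` and its helper methods are made up for illustration and are not part of any JDK tool. The only fact it relies on is that a UTF-8 BOM is the three bytes EF BB BF at the very start of the file.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;

// Hypothetical helper (not part of the JDK): removes a leading UTF-8 BOM in place.
public class BomStripper {
    // U+FEFF encoded as UTF-8 is the byte sequence EF BB BF.
    private static final byte[] UTF8_BOM = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF};

    // Returns true if the data starts with the three BOM bytes.
    static boolean hasBom(byte[] data) {
        if (data.length < UTF8_BOM.length) {
            return false;
        }
        for (int i = 0; i < UTF8_BOM.length; i++) {
            if (data[i] != UTF8_BOM[i]) {
                return false;
            }
        }
        return true;
    }

    // Returns a copy without the BOM, or the same array if there is no BOM.
    static byte[] stripBom(byte[] data) {
        return hasBom(data)
                ? Arrays.copyOfRange(data, UTF8_BOM.length, data.length)
                : data;
    }

    public static void main(String[] args) throws IOException {
        // Each argument is treated as a path to a source file to rewrite in place.
        for (String name : args) {
            Path path = Paths.get(name);
            byte[] original = Files.readAllBytes(path);
            if (hasBom(original)) {
                Files.write(path, stripBom(original));
                System.out.println("Removed BOM from " + name);
            } else {
                System.out.println("No BOM found in " + name);
            }
        }
    }
}
```

Once compiled, this could be run as `java BomStripper x.java`, followed by the `javac -encoding utf8 x.java` command given above. It overwrites the file, so keep a backup if that matters.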

This isn't a problem with your text editor; it's a problem with javac! The Unicode spec says a BOM is optional in UTF-8; it doesn't say it's forbidden! If a BOM can be there, then javac HAS to handle it, but it doesn't. Actually, using the BOM in UTF-8 files IS useful to distinguish an ANSI-coded file from a Unicode-coded file.

The proposed solution of removing the BOM is only a workaround and not the proper solution.

This bug report indicates that this "problem" will never be fixed: http://bugs.java.com/view_bug.do?bug_id=4508058

Since this thread is in the top 2 Google results for the "javac BOM" search, I'm leaving this here for future readers.

https://stackoverflow.com/a/28043356/7050261

"Actually, using the BOM in UTF-8 files IS useful to distinguish an ANSI-coded file from a Unicode-coded file."

Actually

  • The BOM is not about distinguishing ANSI from Unicode. Do not use a feature for a purpose it was not designed for.

  • UTF-8 was intentionally designed to be backward-compatible with 7-bit ASCII (the 0..127 range that ANSI code pages share), so a lot of code that processes formatted text and relies only on bytes 0..127 (XML, JSON, etc.) should work correctly with UTF-8 encoded text without any modifications.
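
The compatibility point about the 0..127 byte range can be checked with a small sketch. The class `AsciiCompat` and its method names are hypothetical, chosen only for this illustration. It shows two properties: text made only of characters in the 0..127 range has identical bytes in US-ASCII and UTF-8, and characters outside that range are encoded in UTF-8 using only bytes with the high bit set, so they cannot be mistaken for delimiters such as `{`, `"` or `<`.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class, not taken from the original thread.
public class AsciiCompat {
    // True when the text encodes to the same bytes in US-ASCII and UTF-8,
    // which holds exactly when every character is in the 0..127 range.
    static boolean sameAsAscii(String text) {
        byte[] ascii = text.getBytes(StandardCharsets.US_ASCII);
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return Arrays.equals(ascii, utf8);
    }

    // True when every byte of the UTF-8 encoding has the high bit set,
    // i.e. no byte falls in the 0..127 range.
    static boolean usesOnlyHighBytes(String text) {
        for (byte b : text.getBytes(StandardCharsets.UTF_8)) {
            if ((b & 0x80) == 0) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Unicode escapes keep this source file itself pure ASCII.
        String json = "{\"key\": 1}";
        String nonAscii = "\u00e9\u65e5\u672c";
        System.out.println("ASCII-only text identical in both encodings: " + sameAsAscii(json));
        System.out.println("Non-ASCII text uses only bytes >= 0x80: " + usesOnlyHighBytes(nonAscii));
    }
}
```

The non-ASCII sample is written with `\u` escapes on purpose, so the example compiles the same way regardless of how the source file is encoded.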
