How to use UTF-8 in resource properties with ResourceBundle

难免孤独 2020-11-22 03:28

I need to use UTF-8 in my resource properties using Java's ResourceBundle. When I enter the text directly into the properties file, it displays as mojibake.

16 answers
  • 2020-11-22 03:43

    Under the covers, ResourceBundle#getBundle() uses PropertyResourceBundle when a .properties file is specified. This in turn uses Properties#load(InputStream) by default to load those properties files. As per the javadoc, they are read as ISO-8859-1 by default.

    public void load(InputStream inStream) throws IOException

    Reads a property list (key and element pairs) from the input byte stream. The input stream is in a simple line-oriented format as specified in load(Reader) and is assumed to use the ISO 8859-1 character encoding; that is each byte is one Latin1 character. Characters not in Latin1, and certain special characters, are represented in keys and elements using Unicode escapes as defined in section 3.3 of The Java™ Language Specification.
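    To make the javadoc statement concrete, here is a minimal sketch of the difference between the two load() overloads. The class name, the key and the sample value are made up for illustration; only Properties#load(InputStream) and Properties#load(Reader) come from the JDK.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

// Hypothetical demo class, not part of the original answer.
class PropertiesEncodingDemo {

    // Illustrative content: the line "greeting=h(e-acute)llo" stored as UTF-8 bytes.
    static final byte[] UTF8_BYTES = "greeting=h\u00e9llo\n".getBytes(StandardCharsets.UTF_8);

    // load(InputStream) treats every byte as one ISO-8859-1 character,
    // so the two UTF-8 bytes of the accented letter become two separate chars.
    static String loadFromStream() {
        Properties props = new Properties();
        try {
            props.load(new ByteArrayInputStream(UTF8_BYTES));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props.getProperty("greeting");
    }

    // load(Reader) decodes with whatever charset the Reader was created with.
    static String loadFromReader() {
        Properties props = new Properties();
        try {
            props.load(new InputStreamReader(
                    new ByteArrayInputStream(UTF8_BYTES), StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props.getProperty("greeting");
    }

    public static void main(String[] args) {
        System.out.println(loadFromStream().length()); // 6: the accented letter turned into two chars
        System.out.println(loadFromReader().length()); // 5: decoded correctly
    }
}
```

    The extra character in the first result is exactly the mojibake described in the question.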

    So, you'd need to save them as ISO-8859-1. If you have any characters beyond the ISO-8859-1 range, can't write the \uXXXX escapes off the top of your head, and are thus forced to save the file as UTF-8, then you'd need to use the native2ascii tool to convert a UTF-8 saved properties file to an ISO-8859-1 saved properties file in which all characters outside that range are converted into \uXXXX format. The example below converts a UTF-8 encoded properties file text_utf8.properties to a valid ISO-8859-1 encoded properties file text.properties.

    native2ascii -encoding UTF-8 text_utf8.properties text.properties
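    If the tool is not at hand, the conversion itself is simple enough to sketch in Java. The following is only a rough approximation of what native2ascii produces, not its actual implementation; the class and method names are invented for this example, and it escapes everything outside ASCII, which is a superset of what ISO-8859-1 strictly requires.

```java
// Hypothetical helper, not part of the JDK or of the original answer.
class UnicodeEscaper {

    // Replaces every non-ASCII UTF-16 code unit with a Unicode escape
    // (backslash, 'u', four hex digits). Characters outside the BMP end up
    // as two escapes (a surrogate pair), which Properties can read back.
    static String escapeNonAscii(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c < 0x80) {
                out.append(c);
            } else {
                out.append(String.format("\\u%04x", (int) c));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Illustrative input: a key with a two-character Chinese value.
        System.out.println(escapeNonAscii("title=\u4e2d\u6587"));
    }
}
```

    The result is pure ASCII, so it is the same byte sequence whether the file is saved as ISO-8859-1 or as UTF-8.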

    When using a sane IDE such as Eclipse, this is done automatically when you create a .properties file in a Java based project and use Eclipse's own editor. Eclipse will transparently convert the characters beyond the ISO-8859-1 range to \uXXXX format (note the "Properties" and "Source" tabs at the bottom of the editor).

    Alternatively, you could also create a custom ResourceBundle.Control implementation wherein you explicitly read the properties files as UTF-8 using InputStreamReader, so that you can just save them as UTF-8 without the need to hassle with native2ascii. Here's a kickoff example:

    import java.io.IOException;
    import java.io.InputStream;
    import java.io.InputStreamReader;
    import java.net.URL;
    import java.net.URLConnection;
    import java.nio.charset.StandardCharsets;
    import java.util.Locale;
    import java.util.PropertyResourceBundle;
    import java.util.ResourceBundle;
    import java.util.ResourceBundle.Control;

    public class UTF8Control extends Control {
        @Override
        public ResourceBundle newBundle
            (String baseName, Locale locale, String format, ClassLoader loader, boolean reload)
                throws IllegalAccessException, InstantiationException, IOException
        {
            // The below is a copy of the default implementation.
            String bundleName = toBundleName(baseName, locale);
            String resourceName = toResourceName(bundleName, "properties");
            ResourceBundle bundle = null;
            InputStream stream = null;
            if (reload) {
                // Bypass the URL connection cache so that a changed file is actually re-read.
                URL url = loader.getResource(resourceName);
                if (url != null) {
                    URLConnection connection = url.openConnection();
                    if (connection != null) {
                        connection.setUseCaches(false);
                        stream = connection.getInputStream();
                    }
                }
            } else {
                stream = loader.getResourceAsStream(resourceName);
            }
            if (stream != null) {
                try {
                    // Only this line is changed to make it read properties files as UTF-8.
                    bundle = new PropertyResourceBundle(new InputStreamReader(stream, StandardCharsets.UTF_8));
                } finally {
                    stream.close();
                }
            }
            return bundle;
        }
    }
    

    This can be used as follows:

    ResourceBundle bundle = ResourceBundle.getBundle("com.example.i18n.text", new UTF8Control());
    

    See also:

    • Unicode - How to get the characters right?
  • 2020-11-22 03:43

    As one answer suggested, I tried a custom ResourceBundle.Control implementation, but that did not help, as the bundle was always requested under the en_US locale. I tried setting my default locale to a different language, and my control implementation was still being called with en_US. I added log messages and stepped through in a debugger to see whether a call with a different locale was made after I changed the locale at run time through XHTML and JSF calls; that did not happen. Then I tried setting the system default encoding to UTF-8 for files read by my server (Tomcat), but that caused problems: not all of my class libraries were compiled as UTF-8, Tomcat started reading them as UTF-8, and the server did not run properly. I ended up implementing a method in my Java controller to be called from the XHTML files. In that method I did the following:

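            // Requires java.nio.charset.StandardCharsets and javax.faces.context.FacesContext.
            public String message(String key, boolean toUTF8) {
                try {
                    FacesContext context = FacesContext.getCurrentInstance();
                    String message = context.getApplication().getResourceBundle(context, "messages").getString(key);

                    // The bundle was read as ISO-8859-1; re-decode the same bytes as UTF-8 when asked to.
                    return toUTF8
                            ? new String(message.getBytes(StandardCharsets.ISO_8859_1), StandardCharsets.UTF_8)
                            : message;
                } catch (RuntimeException e) {
                    // Missing bundle or key: fall back to an empty string, as the original did.
                    return "";
                }
            }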
    

    I was particularly nervous that this could slow down my application. However, after implementing this, my application actually seems faster. I think that is because I am now accessing the properties directly instead of letting JSF parse its way to them. I pass the boolean argument in this call because I know some of the properties are not translated and do not need to be in UTF-8 format.

    Now I have saved my properties file in UTF-8 format, and it is working fine, as each user in my application has their own locale preference.
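    For readers wondering why the getBytes/new String round trip works at all, here is a small self-contained sketch. The class and method names are invented for illustration. It relies on ISO-8859-1 mapping each of the 256 byte values to exactly one character, so the original UTF-8 bytes can be recovered without loss.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the original answer.
class MojibakeRepair {

    // Simulates what happens when UTF-8 bytes are decoded as ISO-8859-1.
    static String misread(String original) {
        return new String(original.getBytes(StandardCharsets.UTF_8), StandardCharsets.ISO_8859_1);
    }

    // Reverses the damage: turn the chars back into the original bytes,
    // then decode those bytes with the charset they were really written in.
    static String repair(String garbled) {
        return new String(garbled.getBytes(StandardCharsets.ISO_8859_1), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "Gr\u00fc\u00dfe \u4e2d\u6587"; // illustrative German and Chinese text
        String garbled = misread(original);
        System.out.println(garbled.equals(original));         // false
        System.out.println(repair(garbled).equals(original)); // true
    }
}
```

    Note that repair() must only be applied to values that really were misread. Running it on an already correct string containing characters above U+00FF would replace them with question marks, which is presumably why the method above takes a boolean flag.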

  • 2020-11-22 03:44

    For what it's worth, my issue was that the files themselves were in the wrong encoding. Using iconv worked for me:

    iconv -f ISO-8859-15 -t UTF-8  messages_nl.properties > messages_nl.properties.new
    
  • 2020-11-22 03:50

    This problem has finally been fixed in Java 9: https://docs.oracle.com/javase/9/intl/internationalization-enhancements-jdk-9

    Default encoding for properties files is now UTF-8.

    Most existing properties files should not be affected: UTF-8 and ISO-8859-1 have the same encoding for ASCII characters, and human-readable non-ASCII ISO-8859-1 encoding is not valid UTF-8. If an invalid UTF-8 byte sequence is detected, the Java runtime automatically rereads the file in ISO-8859-1.
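    The "try UTF-8, fall back to ISO-8859-1" idea from the quote can be sketched with a strict CharsetDecoder. This only illustrates the principle under my own assumptions and is not the JDK's actual implementation; the class and method names are made up.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the JDK.
class Utf8WithFallback {

    // Decodes strictly as UTF-8; if any byte sequence is invalid,
    // decodes the whole input again as ISO-8859-1 instead.
    static String decode(byte[] bytes) {
        try {
            return StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes))
                    .toString();
        } catch (CharacterCodingException e) {
            return new String(bytes, StandardCharsets.ISO_8859_1);
        }
    }

    public static void main(String[] args) {
        String text = "caf\u00e9"; // illustrative value with one accented letter
        System.out.println(decode(text.getBytes(StandardCharsets.UTF_8)).equals(text));      // true
        System.out.println(decode(text.getBytes(StandardCharsets.ISO_8859_1)).equals(text)); // true
    }
}
```

    A lone 0xE9 byte (the accented letter in ISO-8859-1) is not a complete UTF-8 sequence, so the strict decoder rejects it and the fallback kicks in. As far as I recall from the PropertyResourceBundle javadoc, the system property java.util.PropertyResourceBundle.encoding can be set to ISO-8859-1 or UTF-8 to pin one behaviour; please verify that against the documentation for your JDK version.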

  • 2020-11-22 03:52

    We create a resources.utf8 file that contains the resources in UTF-8 and have a rule to run the following:

    native2ascii -encoding utf8 resources.utf8 resources.properties
    
  • 2020-11-22 03:54
    Properties prop = new Properties();
    String fileName = "./src/test/resources/predefined.properties";
    // Read through a Reader so the file is decoded as UTF-8 instead of ISO-8859-1.
    try (InputStreamReader reader = new InputStreamReader(new FileInputStream(fileName), StandardCharsets.UTF_8)) {
        prop.load(reader);
    }
    