Library for converting native2ascii and vice versa

一生所求 2021-01-05 09:49

I'm searching for a library (Apache / BSD / EPL licensed) to convert native text to ASCII using \u for characters not available in ASCII (basically what java.util.Properti

2 answers
  • 2021-01-05 10:29

    Try this piece of code from Apache commons-lang:

    StringEscapeUtils.escapeJava("ایران زیبای من");
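    // The backslashes are doubled so that unescapeJava() receives the literal escapes;
    // with single backslashes the Java compiler would resolve them before the call.
    StringEscapeUtils.unescapeJava("\\u0627\\u06CC\\u0631\\u0627\\u0646 \\u0632\\u06CC\\u0628\\u0627\\u06CC \\u0645\\u0646");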
    
  • 2021-01-05 10:47

    You can do this with a CharsetEncoder. First read the 'native' text with the correct encoding so that it becomes a proper Unicode String. Then use a US-ASCII encoder to detect which characters have to be translated into Unicode escapes.

    import java.nio.charset.CharsetEncoder;
    import java.nio.charset.StandardCharsets;

    import org.junit.Test;

    public class EncodeToEscapes {

        @Test
        public void testEncoding() {
            final String src = "Hallo äöü"; // this has to be read with the right encoding
            final CharsetEncoder asciiEncoder = StandardCharsets.US_ASCII.newEncoder();
            final StringBuilder result = new StringBuilder();
            for (final char character : src.toCharArray()) {
                if (asciiEncoder.canEncode(character)) {
                    // plain ASCII: copy unchanged
                    result.append(character);
                } else {
                    // anything else: backslash, 'u', then four uppercase hex digits
                    result.append("\\u");
                    // OR-ing with 0x10000 forces five hex digits; dropping the first keeps the zero padding
                    result.append(Integer.toHexString(0x10000 | character).substring(1).toUpperCase());
                }
            }
            System.out.println(result);
        }
    }
    

    Additionally, org.apache.commons:commons-lang contains StringEscapeUtils.escapeJava() and StringEscapeUtils.unescapeJava(), which escape and unescape native strings.
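
    The encoder loop above only covers the native-to-ASCII direction. For the "vice versa" part without an extra dependency, the following is a minimal sketch using only the JDK. The class name UnicodeUnescape and the methods asciiToNative and isEscapeAt are hypothetical names chosen for illustration, not part of any library. The sketch assumes each escape is a backslash, a single 'u', and exactly four hex digits.

```java
public class UnicodeUnescape {

    // Hypothetical helper, not a library API: replaces each escape of the form
    // backslash + 'u' + four hex digits with the UTF-16 code unit it names.
    static String asciiToNative(String input) {
        StringBuilder out = new StringBuilder(input.length());
        int i = 0;
        while (i < input.length()) {
            if (isEscapeAt(input, i)) {
                // Parse the four hex digits that follow the backslash and the 'u'
                out.append((char) Integer.parseInt(input.substring(i + 2, i + 6), 16));
                i += 6;
            } else {
                // Not an escape: copy the character unchanged
                out.append(input.charAt(i));
                i++;
            }
        }
        return out.toString();
    }

    // True if a complete six-character escape starts at index i
    private static boolean isEscapeAt(String s, int i) {
        if (i + 6 > s.length() || s.charAt(i) != '\\' || s.charAt(i + 1) != 'u') {
            return false;
        }
        for (int k = i + 2; k < i + 6; k++) {
            if (Character.digit(s.charAt(k), 16) < 0) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(asciiToNative("Hallo \\u00E4\\u00F6\\u00FC"));
    }
}
```

    Malformed or incomplete escapes are copied through unchanged. Unlike the Java compiler, this sketch does not handle repeated 'u' characters or an escaped backslash in front of the 'u', so it should be treated as a starting point only.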
