Creating Unicode character from its number

挽巷 2020-11-28 21:38

I want to display a Unicode character in Java. If I do this, it works just fine:

String symbol = "\u2202";

symbol is equal to "∂". That'

13 Answers
  • 2020-11-28 22:19

    This is how you do it:

    String cc = "2202"; // the code point as a hex string
    char ccc = (char) Integer.parseInt(cc, 16);
    final String text = String.valueOf(ccc);
    

    This solution is by Arne Vajhøj.

  • 2020-11-28 22:20

    Remember that char is an integral type, and thus can be given an integer value, as well as a char constant.

    char c = 0x2202; // aka 8706 in decimal; Unicode escape values are written in hex.
    String s = String.valueOf(c);
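
To make the decimal/hex remark concrete, here is a minimal sketch checking that the hex literal, the decimal literal, and the Unicode escape all denote the same char. The class name CharValueDemo and the helper fromCodeUnit are hypothetical, introduced only for illustration; the range check reflects the fact that char is a 16-bit type.

```java
// Illustrative sketch; CharValueDemo and fromCodeUnit are hypothetical names.
final class CharValueDemo {
    // Builds a one-char string from a UTF-16 code unit given as an int.
    // Only values from 0 to 0xFFFF fit, because char is a 16-bit type.
    static String fromCodeUnit(int value) {
        if (value < Character.MIN_VALUE || value > Character.MAX_VALUE) {
            throw new IllegalArgumentException("does not fit in a char: " + value);
        }
        return String.valueOf((char) value);
    }

    public static void main(String[] args) {
        char hex = 0x2202;      // hex literal
        char dec = 8706;        // the same value in decimal
        char esc = '\u2202';    // the same value as a Unicode escape
        System.out.println(hex == dec && dec == esc);              // true
        System.out.println(Integer.toHexString(hex));              // 2202
        System.out.println(fromCodeUnit(0x2202).equals("\u2202")); // true
    }
}
```

Values above 0xFFFF are rejected here rather than silently truncated by the cast.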
    
  • 2020-11-28 22:21

    Unfortunately, removing one backslash, as suggested in the first comment (newbiedoodle), does not give a good result: most (if not all) IDEs report a syntax error. The reason is that the Java Unicode escape format expects the syntax "\uXXXX", where XXXX are four mandatory hexadecimal digits, so attempts to assemble such a string from pieces fail. Of course, "\u" is not the same as "\\u": the first starts a Unicode escape, while the second is an escaped backslash (that is, a literal backslash) followed by 'u'.

    Strangely, the Apache pages present a utility that behaves exactly this way, but in reality it is an escape-mimicking utility. Apache has some utilities of its own (I didn't test them) that do this work for you, although they may still not be what you want: Apache Escape Unicode utilities. This utility [1] does take a good approach to the solution, though, in combination with the one described above (MeraNaamJoker). My solution is to create the mimic escaped string and then convert it back to Unicode (to avoid the restriction on real Unicode escapes). I used it for copying text, so it is possible that in the uencodeStr method it would be better to use '\\u' instead of '\\\\u'. Try it.

      /**
       * Converts a character to the mimic Unicode escape format, e.g. '\\u0020'.
       * 
       * This format is the Java source code format.
       * 
       *   CharUtils.unicodeEscaped(' ') = "\\u0020"
       *   CharUtils.unicodeEscaped('A') = "\\u0041"
       * 
       * @param ch  the character to convert
       * @return the character as a mimic Unicode escape string
       */
      public static String unicodeEscaped(char ch) {
        String returnStr;
        //String uniTemplate = "\u0000";
        final String charEsc = "\\u"; // a local variable cannot be declared static
    
        if (ch < 0x10) {
          returnStr = "000" + Integer.toHexString(ch);
        }
        else if (ch < 0x100) {
          returnStr = "00" + Integer.toHexString(ch);
        }
        else if (ch < 0x1000) {
          returnStr = "0" + Integer.toHexString(ch);
        }
        else
          returnStr = "" + Integer.toHexString(ch);
    
        return charEsc + returnStr;
      }
    
      /**
       * Converts a string to the mimic Unicode escape format, e.g. '\\u0020'.
       * Note: the real Unicode escape format cannot be used here, because the
       * compiler (and an editor such as NetBeans) translates it into the
       * character immediately. So instead of the real format, i.e. '\u0020',
       * this uses the mimic format '\\u0020' as a plain string, which of course
       * does not give the same result.
       * 
       * This format is the Java source code format.
       * 
       *   encodeStr(" ") = "\\u0020"
       *   encodeStr("A") = "\\u0041"
       * 
       * @param nationalString the string to convert
       * @return the string in the Java mimic Unicode escape format
       */
      public String encodeStr(String nationalString) {
        String convertedString = "";
    
        for (int i = 0; i < nationalString.length(); i++) {
          Character chs = nationalString.charAt(i);
          convertedString += unicodeEscaped(chs);
        }
        return convertedString;
      }
    
      /**
       * Converts a string from the mimic Unicode escape format, e.g. '\\u0020',
       * back to the original text.
       * 
       * This format is the Java source code format.
       * 
       *   uencodeStr("\\u0020") = " "
       *   uencodeStr("\\u0041") = "A"
       * 
       * @param escapedString the string in the Java mimic Unicode escape format
       * @return the decoded string
       */
      public String uencodeStr(String escapedString) {
        String convertedString = "";
    
        String[] arrStr = escapedString.split("\\\\u");
        String str;
        for (int i = 1; i < arrStr.length; i++) {
          str = arrStr[i];
          if (!str.isEmpty()) {
            Integer iI = Integer.parseInt(str, 16);
            char[] chaCha = Character.toChars(iI);
            convertedString += String.valueOf(chaCha);
          }
        }
        return convertedString;
      }
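
The same round trip can be sketched more compactly. This is only an illustration under stated assumptions, not a replacement for the code above: the class name EscapeRoundTrip and both method names are hypothetical, and unescape assumes that every backslash-u in the input is followed by four hex digits. Unlike the split-based version, it copies any text that is not part of an escape unchanged.

```java
// Illustrative sketch; EscapeRoundTrip and its method names are hypothetical.
final class EscapeRoundTrip {
    // Writes every UTF-16 code unit as a backslash, 'u' and four hex digits.
    static String escape(String text) {
        StringBuilder out = new StringBuilder(text.length() * 6);
        for (int i = 0; i < text.length(); i++) {
            out.append(String.format("\\u%04x", (int) text.charAt(i)));
        }
        return out.toString();
    }

    // Turns each six-character escape back into its code unit.
    // Assumes well-formed input; bad hex digits raise NumberFormatException.
    static String unescape(String escaped) {
        StringBuilder out = new StringBuilder(escaped.length());
        int i = 0;
        while (i < escaped.length()) {
            if (escaped.startsWith("\\u", i) && i + 6 <= escaped.length()) {
                String hex = escaped.substring(i + 2, i + 6);
                out.append((char) Integer.parseInt(hex, 16));
                i += 6;
            } else {
                out.append(escaped.charAt(i)); // plain text is kept as-is
                i++;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String escaped = escape("A\u2202");
        System.out.println(escaped.length());                    // 12
        System.out.println(unescape(escaped).equals("A\u2202")); // true
    }
}
```

Because escape works on code units, a character outside the Basic Multilingual Plane becomes two escapes (one per surrogate), and unescape joins them back together.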
    
  • 2020-11-28 22:23

    This one worked fine for me.

      String cc2 = "2202";
      String text2 = String.valueOf(Character.toChars(Integer.parseInt(cc2, 16)));
    

    Now text2 will have ∂.
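
One reason to prefer Character.toChars over a plain cast to char is that it also handles code points above U+FFFF, which need two chars (a surrogate pair). The sketch below illustrates this; the class name SupplementaryDemo and the helper fromHex are hypothetical, and U+1F600 is just an arbitrary supplementary code point chosen for the example.

```java
// Illustrative sketch; SupplementaryDemo and fromHex are hypothetical names.
final class SupplementaryDemo {
    // Parses a hex code point such as "2202" or "1F600" into a String.
    static String fromHex(String hex) {
        int codePoint = Integer.parseInt(hex, 16);
        return new String(Character.toChars(codePoint));
    }

    public static void main(String[] args) {
        String bmp = fromHex("2202");     // fits in a single char
        String astral = fromHex("1F600"); // needs a surrogate pair
        System.out.println(bmp.length());                              // 1
        System.out.println(astral.length());                           // 2
        System.out.println(astral.codePointCount(0, astral.length())); // 1
        // A plain cast keeps only the low 16 bits, so it yields the wrong character.
        System.out.println((char) 0x1F600 == '\uF600');                // true
    }
}
```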

  • 2020-11-28 22:26

    (This answer is for .NET 4.5; there must be a similar approach in Java.)

    I am from West Bengal, India. As I understand it, you want to produce something like 'অ' (a letter of the Bengali alphabet), whose Unicode code point in hex is 0x0985.

    If you know this value for your language, how do you produce the corresponding language-specific Unicode symbol?

    In .NET it is as simple as this:

    int c = 0X0985;
    string x = Char.ConvertFromUtf32(c);
    

    Now x is your answer. But this converts one hex value at a time; sentence-to-sentence conversion is a job for researchers :P
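
For the Java side of this answer, the closest standard-library counterparts I know of are StringBuilder.appendCodePoint and the String(int[], int, int) constructor. The sketch below uses the Bengali letter from the example; the class name CodePointStrings and its method names are hypothetical. One difference from the .NET call, as far as I can tell, is that appendCodePoint accepts lone surrogate values, so treat this as an approximation rather than an exact equivalent.

```java
// Illustrative sketch; CodePointStrings and its method names are hypothetical.
final class CodePointStrings {
    // One code point to a String; appendCodePoint throws
    // IllegalArgumentException for values that are not valid code points.
    static String fromCodePoint(int codePoint) {
        return new StringBuilder().appendCodePoint(codePoint).toString();
    }

    // Several code points to one String.
    static String fromCodePoints(int... codePoints) {
        return new String(codePoints, 0, codePoints.length);
    }

    public static void main(String[] args) {
        String letter = fromCodePoint(0x0985);         // BENGALI LETTER A
        String pair = fromCodePoints(0x0985, 0x0986);  // BENGALI LETTER A, then AA
        System.out.println(letter.equals("\u0985"));   // true
        System.out.println(pair.length());             // 2
    }
}
```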

  • 2020-11-28 22:27

    char c = (char) 0x2202; String s = "" + c;
