Output UTF-16? A little stuck

青春惊慌失措 2021-01-21 17:54

I have some UTF-16 encoded characters in their surrogate pair form. I want to output those surrogate pairs as characters on the screen.

Does anyone know how this is possible?

2 Answers
  • 2021-01-21 18:02

    iconv('UTF-16', 'UTF-8', $yourString)
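
    This call applies when the string holds raw UTF-16 bytes rather than textual escape sequences. A minimal sketch of that case, assuming big-endian bytes without a BOM (the variable names and sample bytes are illustrative, not from the question):

```php
<?php
// Raw UTF-16BE bytes for one surrogate pair: high D869, low DED6.
$utf16 = "\xD8\x69\xDE\xD6";

// Naming the byte order explicitly avoids depending on a BOM
// or on the platform's default for plain 'UTF-16'.
$utf8 = iconv('UTF-16BE', 'UTF-8', $utf16);

// Show the resulting UTF-8 bytes in hex.
echo bin2hex($utf8), "\n"; // f0aa9b96
```

    If the input is little-endian, 'UTF-16LE' would be the source encoding instead.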

  • 2021-01-21 18:21

    Your question is a little unclear.

    If you have ASCII text with embedded UTF-16 escape sequences, you can convert everything to UTF-8 in this way:

    function unescape_utf16($string) {
        /* go for possible surrogate pairs first */
        $string = preg_replace_callback(
            '/\\\\u(D[89ab][0-9a-f]{2})\\\\u(D[c-f][0-9a-f]{2})/i',
            function ($matches) {
                $d = pack("H*", $matches[1].$matches[2]);
                return mb_convert_encoding($d, "UTF-8", "UTF-16BE");
            }, $string);
        /* now the rest */
        $string = preg_replace_callback('/\\\\u([0-9a-f]{4})/i',
            function ($matches) {
                $d = pack("H*", $matches[1]);
                return mb_convert_encoding($d, "UTF-8", "UTF-16BE");
            }, $string);
        return $string;
    }
    
    $string = '\uD869\uDED6';
    echo unescape_utf16($string);
    

    which gives the character 𪛖 (U+2A6D6).
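
    To see why the two escapes form a single character, the surrogate pair can also be combined arithmetically. This is only a sketch: surrogates_to_codepoint is a hypothetical helper, not part of the function above, and mb_chr assumes PHP 7.2 or later with mbstring.

```php
<?php
// Hypothetical helper: combine a high and a low surrogate into one code point.
function surrogates_to_codepoint(int $high, int $low): int {
    // Each surrogate carries 10 payload bits; the result is offset by 0x10000.
    return 0x10000 + (($high - 0xD800) << 10) + ($low - 0xDC00);
}

$cp = surrogates_to_codepoint(0xD869, 0xDED6);
printf("U+%X\n", $cp);           // U+2A6D6

// Encode the code point as UTF-8 for output.
echo mb_chr($cp, 'UTF-8'), "\n";
```

    The regex approach above arrives at the same result by handing both 16-bit units to mb_convert_encoding as UTF-16BE bytes.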
