Unsigned integer as UTF-8 value

故里飘歌 2021-02-03 16:46

Assuming that I have

uint32_t a(3084);

I would like to create a string that stores the Unicode character U+3084, which means that I

4 Answers
  •  醉梦人生
    2021-02-03 17:09

    Here's some C++ code that wouldn't be hard to convert to C. Adapted from an older answer.

    #include <string>

    // Encode a single Unicode code point as a UTF-8 byte sequence.
    std::string UnicodeToUTF8(unsigned int codepoint)
    {
        std::string out;

        if (codepoint <= 0x7f)          // 1 byte:  0xxxxxxx
            out.append(1, static_cast<char>(codepoint));
        else if (codepoint <= 0x7ff)    // 2 bytes: 110xxxxx 10xxxxxx
        {
            out.append(1, static_cast<char>(0xc0 | ((codepoint >> 6) & 0x1f)));
            out.append(1, static_cast<char>(0x80 | (codepoint & 0x3f)));
        }
        else if (codepoint <= 0xffff)   // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        {
            out.append(1, static_cast<char>(0xe0 | ((codepoint >> 12) & 0x0f)));
            out.append(1, static_cast<char>(0x80 | ((codepoint >> 6) & 0x3f)));
            out.append(1, static_cast<char>(0x80 | (codepoint & 0x3f)));
        }
        else                            // 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
        {
            out.append(1, static_cast<char>(0xf0 | ((codepoint >> 18) & 0x07)));
            out.append(1, static_cast<char>(0x80 | ((codepoint >> 12) & 0x3f)));
            out.append(1, static_cast<char>(0x80 | ((codepoint >> 6) & 0x3f)));
            out.append(1, static_cast<char>(0x80 | (codepoint & 0x3f)));
        }
        return out;
    }
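
    Two things to watch for when using a function like the one above. First, it does not check its input, so a surrogate (U+D800 to U+DFFF) or a value above U+10FFFF still produces bytes, and those bytes are not valid UTF-8. Second, the "3084" in U+3084 is hexadecimal, so the value to pass is 0x3084; the decimal literal 3084 is 0x0C0C, a different character. Below is a minimal sketch of a checked variant. The name EncodeUtf8Checked and the choice to return an empty string for invalid input are assumptions made for illustration, not part of the original answer.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper (not from the original answer): encodes one code point
// as UTF-8 and returns an empty string when the value is not a valid Unicode
// scalar value (a surrogate, or anything above U+10FFFF).
std::string EncodeUtf8Checked(std::uint32_t cp)
{
    if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        return std::string();  // assumption: signal invalid input with ""

    std::string out;
    if (cp <= 0x7F) {
        // 1 byte: 0xxxxxxx
        out.push_back(static_cast<char>(cp));
    } else if (cp <= 0x7FF) {
        // 2 bytes: 110xxxxx 10xxxxxx
        out.push_back(static_cast<char>(0xC0 | (cp >> 6)));
        out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
    } else if (cp <= 0xFFFF) {
        // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        out.push_back(static_cast<char>(0xE0 | (cp >> 12)));
        out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
        out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
    } else {
        // 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
        out.push_back(static_cast<char>(0xF0 | (cp >> 18)));
        out.push_back(static_cast<char>(0x80 | ((cp >> 12) & 0x3F)));
        out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
        out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
    }
    return out;
}
```

    With this sketch, EncodeUtf8Checked(0x3084) yields the three bytes E3 82 84, while EncodeUtf8Checked(0xD800) yields an empty string. Note that U+0000 is still valid and encodes to a one-byte string, so an empty result is unambiguous as an error signal.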
    
