Convert wstring to string encoded in UTF-8

囚心锁ツ 2020-11-29 04:19

I need to convert between wstring and string. I figured out that using a codecvt facet should do the trick, but it doesn't seem to work for a UTF-8 locale.

My idea is,

6 Answers
  • 2020-11-29 04:31

    A locale tells the program what the external encoding is; it does not change the internal encoding. If you want to output UTF-8, you need to convert from wchar_t, not from char*.

    Alternatively, you can output the bytes as raw data (not as a string); they should then be interpreted correctly if the system's locale is UTF-8.

    Also, when using (w)cout/(w)cerr/(w)cin, you need to imbue the locale on the stream.
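
    As a rough sketch of both points above, the following shows imbuing a named locale on a wide stream and writing already-encoded UTF-8 bytes unchanged. The helper names try_imbue and write_raw_utf8 are hypothetical and not part of this answer, and which locale names exist (for example "en_US.UTF-8") depends on the system.

    ```cpp
    #include <locale>
    #include <ostream>
    #include <stdexcept>
    #include <string>

    // Hypothetical helper: try to imbue a named locale on a wide stream.
    // std::locale's constructor throws std::runtime_error when the name is
    // not known to the system, so report that as false instead of crashing.
    bool try_imbue(std::wostream& out, const char* locale_name) {
        try {
            out.imbue(std::locale(locale_name));
            return true;
        } catch (const std::runtime_error&) {
            return false;
        }
    }

    // Hypothetical helper: write bytes that are already UTF-8 encoded as raw
    // data, with no conversion applied by this function.
    void write_raw_utf8(std::ostream& out, const std::string& utf8) {
        out.write(utf8.data(), static_cast<std::streamsize>(utf8.size()));
    }
    ```

    A call such as try_imbue(std::wcout, "en_US.UTF-8") would be the typical use; if it returns false, the stream keeps its previous locale.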

  • 2020-11-29 04:32

    The Lexertl library has an iterator that lets you do this:

    std::string str;
    str.assign(
      lexertl::basic_utf8_out_iterator<std::wstring::const_iterator>(wstr.begin()),
      lexertl::basic_utf8_out_iterator<std::wstring::const_iterator>(wstr.end()));
    
  • 2020-11-29 04:35

    You can use Boost.Locale's utf_to_utf converter to get UTF-8 encoded char data that can be stored in a std::string.

    #include <boost/locale/encoding_utf.hpp>
    std::string myresult = boost::locale::conv::utf_to_utf<char>(my_wstring);
    
  • 2020-11-29 04:39

    The code below might help you :)

    #include <codecvt>
    #include <locale>   // std::wstring_convert is declared here
    #include <string>
    
    // convert UTF-8 string to wstring
    std::wstring utf8_to_wstring (const std::string& str)
    {
        std::wstring_convert<std::codecvt_utf8<wchar_t>> myconv;
        return myconv.from_bytes(str);
    }
    
    // convert wstring to UTF-8 string
    std::string wstring_to_utf8 (const std::wstring& str)
    {
        std::wstring_convert<std::codecvt_utf8<wchar_t>> myconv;
        return myconv.to_bytes(str);
    }
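
    Note that std::wstring_convert and <codecvt> are deprecated since C++17, and that std::codecvt_utf8<wchar_t> does not decode surrogate pairs where wchar_t is 16 bits wide (as on Windows). As a possible fallback, here is a dependency-free sketch of the wstring to UTF-8 direction. The names append_utf8 and wstring_to_utf8_manual are made up for illustration, and replacing invalid input with U+FFFD is an assumed policy, not something this answer specifies.

    ```cpp
    #include <cstddef>
    #include <cstdint>
    #include <string>

    // Hypothetical helper: append one Unicode code point to out as UTF-8.
    static void append_utf8(std::string& out, std::uint32_t cp) {
        if (cp < 0x80) {
            out.push_back(static_cast<char>(cp));
        } else if (cp < 0x800) {
            out.push_back(static_cast<char>(0xC0 | (cp >> 6)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        } else if (cp < 0x10000) {
            out.push_back(static_cast<char>(0xE0 | (cp >> 12)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        } else {
            out.push_back(static_cast<char>(0xF0 | (cp >> 18)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 12) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        }
    }

    // Hypothetical helper: convert a wstring to UTF-8 by hand.
    // Treats the input as UTF-16 when wchar_t is 16 bits, else as UTF-32.
    std::string wstring_to_utf8_manual(const std::wstring& in) {
        std::string out;
        for (std::size_t i = 0; i < in.size(); ++i) {
            std::uint32_t cp = static_cast<std::uint32_t>(in[i]);
            // With 16-bit wchar_t, join a high/low surrogate pair.
            if (sizeof(wchar_t) == 2 && cp >= 0xD800 && cp <= 0xDBFF &&
                i + 1 < in.size()) {
                std::uint32_t low = static_cast<std::uint32_t>(in[i + 1]);
                if (low >= 0xDC00 && low <= 0xDFFF) {
                    cp = 0x10000 + ((cp - 0xD800) << 10) + (low - 0xDC00);
                    ++i;
                }
            }
            // Assumed policy: lone surrogates and out-of-range values
            // become U+FFFD (the replacement character).
            if ((cp >= 0xD800 && cp <= 0xDFFF) || cp > 0x10FFFF) {
                cp = 0xFFFD;
            }
            append_utf8(out, cp);
        }
        return out;
    }
    ```

    The function is meant as a drop-in alternative to wstring_to_utf8 above for toolchains where the deprecated facilities are unavailable or produce warnings.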
    
  • 2020-11-29 04:48

    What's your platform? Note that Windows has traditionally not supported UTF-8 locales, which may explain why this is failing for you.

    To do this in a platform-dependent way, you can use MultiByteToWideChar/WideCharToMultiByte on Windows and iconv on Linux. You may be able to use some Boost magic to do it in a platform-independent way, but I haven't tried that myself, so I can't say more about that option.
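
    The platform-dependent route could look roughly like the sketch below. The function name wide_to_utf8_native is hypothetical, and the iconv branch assumes an implementation that accepts the encoding name "WCHAR_T" (glibc and GNU libiconv do); on some systems you may also need to link with -liconv.

    ```cpp
    #include <cstddef>
    #include <stdexcept>
    #include <string>
    #ifdef _WIN32
    #include <windows.h>
    #else
    #include <iconv.h>
    #endif

    // Hypothetical helper: convert a wstring to UTF-8 with the native API.
    std::string wide_to_utf8_native(const std::wstring& in) {
        if (in.empty()) return std::string();
    #ifdef _WIN32
        // First call asks for the required buffer size in bytes.
        int len = WideCharToMultiByte(CP_UTF8, 0, in.data(),
                                      static_cast<int>(in.size()),
                                      nullptr, 0, nullptr, nullptr);
        if (len <= 0) throw std::runtime_error("WideCharToMultiByte failed");
        std::string out(static_cast<std::size_t>(len), '\0');
        // Second call performs the actual conversion.
        WideCharToMultiByte(CP_UTF8, 0, in.data(),
                            static_cast<int>(in.size()),
                            &out[0], len, nullptr, nullptr);
        return out;
    #else
        // Assumption: "WCHAR_T" names the platform's wchar_t encoding.
        iconv_t cd = iconv_open("UTF-8", "WCHAR_T");
        if (cd == (iconv_t)-1) throw std::runtime_error("iconv_open failed");
        // A code point needs at most 4 bytes in UTF-8.
        std::string out(in.size() * 4, '\0');
        char* src = reinterpret_cast<char*>(const_cast<wchar_t*>(in.data()));
        std::size_t src_left = in.size() * sizeof(wchar_t);
        char* dst = &out[0];
        std::size_t dst_left = out.size();
        std::size_t rc = iconv(cd, &src, &src_left, &dst, &dst_left);
        iconv_close(cd);
        if (rc == static_cast<std::size_t>(-1)) {
            throw std::runtime_error("iconv failed");
        }
        out.resize(out.size() - dst_left);  // trim the unused tail
        return out;
    #endif
    }
    ```

    The reverse direction follows the same pattern with MultiByteToWideChar, or with the two encoding names swapped in iconv_open.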

  • 2020-11-29 04:55

    Standard C++ has very little built-in knowledge of Unicode. Use an external library such as ICU (UnicodeString class) or Qt (QString class); both support Unicode, including UTF-8.
