Question
I am using the ICU library in C++ on OS X. All of my strings are UnicodeStrings, but I need to use system calls like fopen, fread and so forth. These functions take const char* or char* as arguments. I have read that OS X supports UTF-8 internally, so that all I need to do is convert my UnicodeString to UTF-8, but I don't know how to do that.
UnicodeString has a toUTF8() member function, but it returns a ByteSink. I've also found these examples: http://source.icu-project.org/repos/icu/icu/trunk/source/samples/ucnv/convsamp.cpp and read about using a converter, but I'm still confused. Any help would be much appreciated.
Answer 1:
Call UnicodeString::extract(...) to extract into a char*. Pass NULL for the converter to get the default converter, which uses the charset your OS is using.
Answer 2:
The UTF-8 chapter of the ICU User Guide describes how to do this:
The simplest way to use UTF-8 strings in UTF-16 APIs is via the C++ icu::UnicodeString methods fromUTF8(const StringPiece &utf8) and toUTF8String(StringClass &result). There is also toUTF8(ByteSink &sink).
extract() is no longer the preferred approach.
Note: icu::UnicodeString has constructors, setTo() and extract() methods which take either a converter object or a charset name. These can be used for UTF-8, but are not as efficient or convenient as the fromUTF8()/toUTF8()/toUTF8String() methods mentioned above.
Answer 3:
This will work:
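std::string utf8;          // needs <string> and <unicode/unistr.h>
uStr.toUTF8String(utf8);   // uStr is an icu::UnicodeString; its UTF-8 bytes are appended to utf8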
Source: https://stackoverflow.com/questions/3150581/unicodestring-to-char-utf-8