Question
Let's see if I can explain this without too many factual errors...
I'm writing a string class and I want it to use UTF-8 (stored in a std::string) as its internal storage.
I want it to be able to take both a "normal" std::string and a std::wstring as input and output.
Working with std::wstring is not a problem: I can use std::codecvt_utf8<wchar_t> to convert both to and from std::wstring.
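For reference, the std::wstring round trip mentioned above can be sketched roughly as follows. The helper names wide_to_utf8 and utf8_to_wide are made up for illustration. Two caveats: std::wstring_convert and <codecvt> are deprecated since C++17, and where wchar_t is 16 bits (as on Windows) std::codecvt_utf8<wchar_t> handles UCS-2 only, so std::codecvt_utf8_utf16<wchar_t> is probably the better fit if characters outside the BMP matter.

```cpp
#include <codecvt>
#include <locale>
#include <string>

// Hypothetical helper: encode a wide string as UTF-8.
// Throws std::range_error if the input cannot be converted.
std::string wide_to_utf8(const std::wstring& ws) {
    std::wstring_convert<std::codecvt_utf8<wchar_t>> conv;
    return conv.to_bytes(ws);
}

// Hypothetical helper: decode UTF-8 into a wide string.
// Throws std::range_error on malformed UTF-8.
std::wstring utf8_to_wide(const std::string& s) {
    std::wstring_convert<std::codecvt_utf8<wchar_t>> conv;
    return conv.from_bytes(s);
}
```

A converter object is created per call here to keep the sketch simple; reusing one instance would avoid repeated construction.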
However, after extensive googling and searching on SO, I have yet to find a way to convert between a "normal/default" C++ std::string (which I assume on Windows uses the local system code page?) and a UTF-8 std::string.
I guess one option would be to first convert the std::string to a std::wstring using std::codecvt<wchar_t, char> and then convert that to UTF-8 as above, but this seems quite inefficient, given that at least the first 128 values of a char should translate straight over to UTF-8 without conversion regardless of localization, if I understand correctly.
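The point about the first 128 values suggests a cheap fast path, sketched below. It assumes the local code page is ASCII-compatible, which is the case for the usual Windows ANSI code pages but remains an assumption here. The function name is_ascii is made up for illustration. The check has to cover the whole string rather than single bytes, because in multibyte code pages such as Shift-JIS a trail byte can fall inside the ASCII range.

```cpp
#include <algorithm>
#include <string>

// Hypothetical helper: true when every byte is 7-bit ASCII (0x00-0x7F).
// Under the ASCII-compatibility assumption above, such a string is already
// valid UTF-8 and can be stored without any conversion.
bool is_ascii(const std::string& s) {
    return std::all_of(s.begin(), s.end(),
                       [](unsigned char c) { return c < 0x80; });
}
```

A caller could test is_ascii(input) first and only fall back to a full code-page conversion when it returns false.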
I found this similar question: C++: how to convert ASCII or ANSI to UTF8 and stores in std::string. However, I'm a bit skeptical of that answer, as it is hard-coded to Latin-1 and I want this to work with every kind of localization, to be on the safe side.
No answers involving Boost, please; I don't want the headache of getting my codebase to work with it.
Answer 1:
If your "normal string" is encoded using the system's code page and you want to convert it to UTF-8, then this should work on Windows. It first converts to UTF-16 and then from UTF-16 to UTF-8:
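#include <string>
#include <windows.h>

std::string codepage_str;  // input, encoded in the system's ANSI code page
// Pass 1: query the required length in UTF-16 code units. The flags are 0
// (default, precomposed output); MB_COMPOSITE would emit decomposed characters.
int size = MultiByteToWideChar(CP_ACP, 0, codepage_str.c_str(),
                               static_cast<int>(codepage_str.length()), nullptr, 0);
// Pass 2: convert into a buffer of exactly that size.
std::wstring utf16_str(size, L'\0');
MultiByteToWideChar(CP_ACP, 0, codepage_str.c_str(),
                    static_cast<int>(codepage_str.length()), &utf16_str[0], size);
// Same two-pass pattern for UTF-16 -> UTF-8.
int utf8_size = WideCharToMultiByte(CP_UTF8, 0, utf16_str.c_str(),
                                    static_cast<int>(utf16_str.length()), nullptr, 0,
                                    nullptr, nullptr);
std::string utf8_str(utf8_size, '\0');
WideCharToMultiByte(CP_UTF8, 0, utf16_str.c_str(),
                    static_cast<int>(utf16_str.length()), &utf8_str[0], utf8_size,
                    nullptr, nullptr);
// Both functions return 0 for empty input, so utf8_str is then simply empty.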
Source: https://stackoverflow.com/questions/21575310/converting-normal-stdstring-to-utf-8