I'm working on a native extension for a Zinc-based Flash application and I need to convert a const char* to a wstring.
This is my code:
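AFAIK this works only with C++11 and above (note that std::wstring_convert and <codecvt> are deprecated as of C++17):
#include <codecvt>
#include <locale>
#include <string>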
// ...
std::wstring stringToWstring(const std::string& t_str)
{
//setup converter
typedef std::codecvt_utf8<wchar_t> convert_type;
std::wstring_convert<convert_type, wchar_t> converter;
//use converter (.to_bytes: wstr->str, .from_bytes: str->wstr)
return converter.from_bytes(t_str);
}
Reference answer
I recommend using std::string instead of C-style strings (char*) wherever possible. You can create a std::string object from a const char* by simply passing it to the constructor.
Once you have a std::string, you can write a simple function that converts a std::string containing multi-byte UTF-8 characters to a std::wstring containing UTF-16 code units (the 16-bit representation of the characters in the std::string).
There are several ways to do that; here is one that uses the Windows MultiByteToWideChar function:
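#include <windows.h>
#include <string>
std::wstring s2ws(const std::string& str)
{
// MultiByteToWideChar fails when the input length is 0
if (str.empty()) return std::wstring();
// first call: ask how many wchar_t units the result needs
int size_needed = MultiByteToWideChar(CP_UTF8, 0, &str[0], (int)str.size(), NULL, 0);
std::wstring wstrTo( size_needed, 0 );
// second call: convert into the allocated buffer
MultiByteToWideChar(CP_UTF8, 0, &str[0], (int)str.size(), &wstrTo[0], size_needed);
return wstrTo;
}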
Check these questions too:
Mapping multibyte characters to their unicode point representation
Why use MultiByteToWideCharArray to convert std::string to std::wstring?
On OS X wstring uses UTF-32 rather than UTF-16. You can do the conversion like this:
#include <codecvt>
#include <locale>
#include <string>
#include <utility>
// make facets usable by giving them a public destructor
template <class Facet>
class usable_facet
: public Facet
{
public:
template <class ...Args>
usable_facet(Args&& ...args)
: Facet(std::forward<Args>(args)...) {}
~usable_facet() {}
};
std::wstring s2ws(std::string const &s) {
std::wstring_convert<
usable_facet<std::codecvt<char32_t,char,std::mbstate_t>>
,char32_t> convert;
std::u32string utf32 = convert.from_bytes(s);
static_assert(sizeof(wchar_t)==sizeof(char32_t),"char32_t and wchar_t must have same size");
return {begin(utf32),end(utf32)};
}
An addition to the answer from @anhoppe. Here's how to convert char*:
#include <codecvt>
#include <locale>
// ...
std::wstring stringToWstring(const char* utf8Bytes)
{
//setup converter
using convert_type = std::codecvt_utf8<typename std::wstring::value_type>;
std::wstring_convert<convert_type, typename std::wstring::value_type> converter;
//use converter (.to_bytes: wstr->str, .from_bytes: str->wstr)
return converter.from_bytes(utf8Bytes);
}
And here's how to convert char* if you already know the length of the buffer:
#include <codecvt>
#include <locale>
// ...
std::wstring stringToWstring(const char* utf8Bytes, const size_t numBytes)
{
//setup converter
using convert_type = std::codecvt_utf8<typename std::wstring::value_type>;
std::wstring_convert<convert_type, typename std::wstring::value_type> converter;
//use converter (.to_bytes: wstr->str, .from_bytes: str->wstr)
return converter.from_bytes(utf8Bytes, utf8Bytes + numBytes);
}
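Both overloads above let from_bytes throw std::range_error when the bytes are not valid UTF-8, since no error string was passed to the converter. If that matters in your extension, a minimal sketch of catching it could look like the following. The name utf8ToWstringOr and the fallback parameter are made up for illustration and do not appear in the answers above.

```cpp
#include <codecvt>
#include <locale>
#include <stdexcept>
#include <string>

// Hypothetical helper (name and fallback parameter are assumptions):
// converts UTF-8 bytes to std::wstring and returns `fallback`
// when the input is not valid UTF-8.
std::wstring utf8ToWstringOr(const char* utf8Bytes, const std::wstring& fallback)
{
    std::wstring_convert<std::codecvt_utf8<wchar_t>, wchar_t> converter;
    try {
        return converter.from_bytes(utf8Bytes);
    } catch (const std::range_error&) {
        // from_bytes reports a failed conversion by throwing std::range_error
        return fallback;
    }
}
```

A call such as utf8ToWstringOr(ptr, L"?") then never throws on malformed input. Whether a silent fallback or a propagated exception is the better choice depends on the caller.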
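Here's some code I found. Note that it copies each byte into its own wchar_t, so it is only correct for plain ASCII input: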
std::wstring StringToWString(const std::string& s)
{
std::wstring temp(s.length(),L' ');
std::copy(s.begin(), s.end(), temp.begin());
return temp;
}
And here's the original forum post with a possible second solution using the Windows API function MultiByteToWideChar:
http://forums.codeguru.com/archive/index.php/t-193852.html
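To see where the byte-by-byte copy in StringToWString stops being a real conversion, consider "é": in UTF-8 it is the two bytes C3 A9, so the copy produces two wchar_t values instead of one character. The sketch below repeats the function so it is self-contained and adds isAscii, a hypothetical guard that is not part of the original post, to detect input for which the copy is safe.

```cpp
#include <algorithm>
#include <string>

// Same byte-by-byte copy as in the answer, repeated here so the
// sketch compiles on its own.
std::wstring StringToWString(const std::string& s)
{
    std::wstring temp(s.length(), L' ');
    std::copy(s.begin(), s.end(), temp.begin());
    return temp;
}

// Hypothetical guard (an assumption, not from the post): true when
// every byte is 7-bit ASCII, the only input for which the copy above
// yields the right characters.
bool isAscii(const std::string& s)
{
    return std::all_of(s.begin(), s.end(), [](char c) {
        return static_cast<unsigned char>(c) < 0x80;
    });
}
```

For anything that fails isAscii, one of the UTF-8 aware conversions from the other answers is the safer route.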
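You need a library that can encode/decode UTF-8. Before C++11 this functionality wasn't included in the standard C++ library. Here's one library you might use: http://utfcpp.sourceforge.net/
Here's an example of its use (utf8to32 suits platforms where wchar_t is 32 bits; the library also provides utf8::utf8to16 for platforms such as Windows where wchar_t is 16 bits):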
utf8::utf8to32(bytes.begin(), bytes.end(), std::back_inserter(wstr));
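If pulling in a library is not an option, the decoding itself is small enough to write by hand. The following is a rough sketch, not utfcpp's implementation: utf8ToWide is a made-up name, and the function picks UTF-32 or UTF-16 output based on sizeof(wchar_t), which covers the OS X and Windows cases mentioned in the other answers. It throws std::range_error on malformed input.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: decodes UTF-8 and stores the result as UTF-32
// or UTF-16, depending on the size of wchar_t on this platform.
std::wstring utf8ToWide(const std::string& in)
{
    std::wstring out;
    std::size_t i = 0;
    while (i < in.size()) {
        const unsigned char lead = static_cast<unsigned char>(in[i]);
        char32_t cp;
        std::size_t extra;  // number of continuation bytes expected
        if (lead < 0x80)                { cp = lead;        extra = 0; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; }
        else throw std::range_error("invalid UTF-8 lead byte");

        if (extra > in.size() - i - 1)
            throw std::range_error("truncated UTF-8 sequence");
        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char c = static_cast<unsigned char>(in[i + k]);
            if ((c & 0xC0) != 0x80)
                throw std::range_error("invalid UTF-8 continuation byte");
            cp = (cp << 6) | (c & 0x3F);
        }
        i += extra + 1;

        // Reject overlong encodings, surrogates and out-of-range values.
        static const char32_t minForLength[] = {0, 0x80, 0x800, 0x10000};
        if (cp < minForLength[extra] || cp > 0x10FFFF ||
            (cp >= 0xD800 && cp <= 0xDFFF))
            throw std::range_error("invalid code point");

        if (sizeof(wchar_t) >= 4 || cp < 0x10000) {
            out.push_back(static_cast<wchar_t>(cp));
        } else {
            // 16-bit wchar_t: encode the code point as a surrogate pair.
            cp -= 0x10000;
            out.push_back(static_cast<wchar_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<wchar_t>(0xDC00 + (cp & 0x3FF)));
        }
    }
    return out;
}
```

A tested library is still the better choice for production code; the sketch is mainly meant to show what such a library does under the hood.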