The question is how to convert wstring to string?
I have next example :
#include
#include
int main()
{
std::wstr
In my case, I have to use multibyte character (MBCS), and I want to use std::string and std::wstring. And can't use c++11. So I use mbstowcs and wcstombs.
I make same function with using new, delete [], but it is slower then this.
This can help How to: Convert Between Various String Types
EDIT
However, in case of converting to wstring and source string is no alphabet and multi byte string, it's not working. So I change wcstombs to WideCharToMultiByte.
#include <string>
std::wstring get_wstr_from_sz(const char* psz)
{
//I think it's enough to my case
wchar_t buf[0x400];
wchar_t *pbuf = buf;
size_t len = strlen(psz) + 1;
if (len >= sizeof(buf) / sizeof(wchar_t))
{
pbuf = L"error";
}
else
{
size_t converted;
mbstowcs_s(&converted, buf, psz, _TRUNCATE);
}
return std::wstring(pbuf);
}
std::string get_string_from_wsz(const wchar_t* pwsz)
{
char buf[0x400];
char *pbuf = buf;
size_t len = wcslen(pwsz)*2 + 1;
if (len >= sizeof(buf))
{
pbuf = "error";
}
else
{
size_t converted;
wcstombs_s(&converted, buf, pwsz, _TRUNCATE);
}
return std::string(pbuf);
}
EDIT to use 'MultiByteToWideChar' instead of 'wcstombs'
#include <Windows.h>
#include <boost/shared_ptr.hpp>
#include "string_util.h"
std::wstring get_wstring_from_sz(const char* psz)
{
int res;
wchar_t buf[0x400];
wchar_t *pbuf = buf;
boost::shared_ptr<wchar_t[]> shared_pbuf;
res = MultiByteToWideChar(CP_ACP, 0, psz, -1, buf, sizeof(buf)/sizeof(wchar_t));
if (0 == res && GetLastError() == ERROR_INSUFFICIENT_BUFFER)
{
res = MultiByteToWideChar(CP_ACP, 0, psz, -1, NULL, 0);
shared_pbuf = boost::shared_ptr<wchar_t[]>(new wchar_t[res]);
pbuf = shared_pbuf.get();
res = MultiByteToWideChar(CP_ACP, 0, psz, -1, pbuf, res);
}
else if (0 == res)
{
pbuf = L"error";
}
return std::wstring(pbuf);
}
std::string get_string_from_wcs(const wchar_t* pcs)
{
int res;
char buf[0x400];
char* pbuf = buf;
boost::shared_ptr<char[]> shared_pbuf;
res = WideCharToMultiByte(CP_ACP, 0, pcs, -1, buf, sizeof(buf), NULL, NULL);
if (0 == res && GetLastError() == ERROR_INSUFFICIENT_BUFFER)
{
res = WideCharToMultiByte(CP_ACP, 0, pcs, -1, NULL, 0, NULL, NULL);
shared_pbuf = boost::shared_ptr<char[]>(new char[res]);
pbuf = shared_pbuf.get();
res = WideCharToMultiByte(CP_ACP, 0, pcs, -1, pbuf, res, NULL, NULL);
}
else if (0 == res)
{
pbuf = "error";
}
return std::string(pbuf);
}
Solution from: http://forums.devshed.com/c-programming-42/wstring-to-string-444006.html
std::wstring wide( L"Wide" );
std::string str( wide.begin(), wide.end() );
// Will print no problemo!
std::cout << str << std::endl;
Beware that there is no character set conversion going on here at all. What this does is simply to assign each iterated wchar_t
to a char
- a truncating conversion. It uses the std::string c'tor:
template< class InputIt >
basic_string( InputIt first, InputIt last,
const Allocator& alloc = Allocator() );
As stated in comments:
values 0-127 are identical in virtually every encoding, so truncating values that are all less than 127 results in the same text. Put in a chinese character and you'll see the failure.
-
the values 128-255 of windows codepage 1252 (the Windows English default) and the values 128-255 of unicode are mostly the same, so if that's teh codepage you're using most of those characters should be truncated to the correct values. (I totally expected á and õ to work, I know our code at work relies on this for é, which I will soon fix)
And note that code points in the range 0x80 - 0x9F
in Win1252 will not work. This includes €
, œ
, ž
, Ÿ
, ...
There are two issues with the code:
The conversion in const std::string s( ws.begin(), ws.end() );
is not required to correctly map the wide characters to their narrow counterpart. Most likely, each wide character will just be typecast to char
.
The resolution to this problem is already given in the answer by kem and involves the narrow
function of the locale's ctype
facet.
You are writing output to both std::cout
and std::wcout
in the same program. Both cout
and wcout
are associated with the same stream (stdout
) and the results of using the same stream both as a byte-oriented stream (as cout
does) and a wide-oriented stream (as wcout
does) are not defined.
The best option is to avoid mixing narrow and wide output to the same (underlying) stream. For stdout
/cout
/wcout
, you can try switching the orientation of stdout
when switching between wide and narrow output (or vice versa):
#include <iostream>
#include <stdio.h>
#include <wchar.h>
int main() {
std::cout << "narrow" << std::endl;
fwide(stdout, 1); // switch to wide
std::wcout << L"wide" << std::endl;
fwide(stdout, -1); // switch to narrow
std::cout << "narrow" << std::endl;
fwide(stdout, 1); // switch to wide
std::wcout << L"wide" << std::endl;
}
Instead of including locale and all that fancy stuff, if you know for FACT your string is convertible just do this:
#include <iostream>
#include <string>
using namespace std;
int main()
{
wstring w(L"bla");
string result;
for(char x : w)
result += x;
cout << result << '\n';
}
Live example here
As Cubbi pointed out in one of the comments, std::wstring_convert
(C++11) provides a neat simple solution (you need to #include
<locale>
and <codecvt>
):
std::wstring string_to_convert;
//setup converter
using convert_type = std::codecvt_utf8<wchar_t>;
std::wstring_convert<convert_type, wchar_t> converter;
//use converter (.to_bytes: wstr->str, .from_bytes: str->wstr)
std::string converted_str = converter.to_bytes( string_to_convert );
I was using a combination of wcstombs
and tedious allocation/deallocation of memory before I came across this.
http://en.cppreference.com/w/cpp/locale/wstring_convert
update(2013.11.28)
One liners can be stated as so (Thank you Guss for your comment):
std::wstring str = std::wstring_convert<std::codecvt_utf8<wchar_t>>().from_bytes("some string");
Wrapper functions can be stated as so: (Thank you ArmanSchwarz for your comment)
std::wstring s2ws(const std::string& str)
{
using convert_typeX = std::codecvt_utf8<wchar_t>;
std::wstring_convert<convert_typeX, wchar_t> converterX;
return converterX.from_bytes(str);
}
std::string ws2s(const std::wstring& wstr)
{
using convert_typeX = std::codecvt_utf8<wchar_t>;
std::wstring_convert<convert_typeX, wchar_t> converterX;
return converterX.to_bytes(wstr);
}
Note: there's some controversy on whether string
/wstring
should be passed in to functions as references or as literals (due to C++11 and compiler updates). I'll leave the decision to the person implementing, but it's worth knowing.
Note: I'm using std::codecvt_utf8
in the above code, but if you're not using UTF-8 you'll need to change that to the appropriate encoding you're using:
http://en.cppreference.com/w/cpp/header/codecvt
You might as well just use the ctype facet's narrow method directly:
#include <clocale> #include <locale> #include <string> #include <vector> inline std::string narrow(std::wstring const& text) { std::locale const loc(""); wchar_t const* from = text.c_str(); std::size_t const len = text.size(); std::vector<char> buffer(len + 1); std::use_facet<std::ctype<wchar_t> >(loc).narrow(from, from + len, '_', &buffer[0]); return std::string(&buffer[0], &buffer[len]); }