Consider this sample program:
#include <cstdio>
#include <cwchar>
#include <string>
int main()
{
std::string narrowstr = "narrow";
Note that you're using C streams. C streams have a very special quality called "orientation". A stream is either unoriented, wide, or narrow (byte-oriented). Orientation is decided by the first I/O operation performed on any particular stream, or by an explicit call to fwide (see http://en.cppreference.com/w/cpp/io/c for a summary of C I/O streams).
In your case, stdout starts out unoriented, and by executing the first printf, you're setting it narrow. Once narrow, it's stuck narrow, and wprintf fails (check its return code!). The only way to change a C stream's orientation is to freopen it, which doesn't quite work with stdout. That's why 3 and 4 didn't print.
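If you want to watch this happen, here is a minimal sketch. It uses a temporary file instead of stdout so the experiment doesn't disturb your real output; probe_orientation and OrientationProbe are names made up for illustration. fwide(f, 0) only queries the orientation. What the wide call returns on a narrow stream depends on your C library (glibc reports failure, while some runtimes don't track orientation at all), so treat that field as something to inspect rather than rely on.

```cpp
#include <cstdio>
#include <cwchar>

// Hypothetical helper type: records what was observed about one stream.
struct OrientationProbe {
    int before;        // fwide() result before any I/O: 0 means unoriented
    int after_printf;  // fwide() result after a narrow write: negative means narrow
    int wide_result;   // return value of the wide write attempted afterwards
};

OrientationProbe probe_orientation() {
    OrientationProbe p = {0, 0, 0};
    std::FILE* f = std::tmpfile();  // a fresh stream, not yet oriented
    if (!f) return p;               // no temp file available; nothing to observe

    p.before = std::fwide(f, 0);            // mode 0 queries without setting anything
    std::fprintf(f, "1 %s \n", "narrow");   // first output decides the orientation
    p.after_printf = std::fwide(f, 0);
    // Wide output on a stream that is already narrow: check the return code.
    p.wide_result = std::fwprintf(f, L"3 %ls \n", L"wide");
    std::fclose(f);
    return p;
}
```

On a runtime that tracks orientation, before should be 0 and after_printf negative; a negative wide_result is the failure the answer is telling you to check for.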
The difference between 1 and 3 is that 1 is a narrow output function using the narrow string conversion specifier %s: it reads bytes from the char array and sends bytes into a byte stream. 3 is a wide output function with a narrow string conversion specifier %s: it first reads bytes from the char array and mbtowcs them into wchar_ts, then sends the wchar_ts into a wide stream, which then wctombs them into bytes or multibyte sequences that are then pushed to standard output with a write.
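The two conversion steps in case 3 can be approximated by hand. This is only a sketch of the idea, not what any particular library does internally: widen and narrow are hypothetical helpers, and they use the whole-string functions mbstowcs and wcstombs instead of the per-character mbtowc and wctomb. Both depend on the current C locale.

```cpp
#include <cstdlib>
#include <string>
#include <vector>

// Bytes -> wchar_t, roughly what a wide printf does with a %s argument.
std::wstring widen(const std::string& s) {
    std::vector<wchar_t> buf(s.size() + 1);  // at most one wchar_t per input byte
    std::size_t n = std::mbstowcs(buf.data(), s.c_str(), buf.size());
    if (n == static_cast<std::size_t>(-1)) return std::wstring();  // invalid sequence
    return std::wstring(buf.data(), n);
}

// wchar_t -> bytes, roughly what a wide stream does before writing out.
std::string narrow(const std::wstring& w) {
    std::vector<char> buf(w.size() * MB_CUR_MAX + 1);  // worst case per character
    std::size_t n = std::wcstombs(buf.data(), w.c_str(), buf.size());
    if (n == static_cast<std::size_t>(-1)) return std::string();  // unrepresentable character
    return std::string(buf.data(), n);
}
```

For plain ASCII input such as "narrow", narrow(widen(s)) gives back the original bytes; anything beyond ASCII depends on the locale you have set.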
Finally, if widestr is in UTF-16, you must be using Windows, and all bets are off; there is very little support for anything beyond ASCII in the standard C I/O functions on that platform. You may as well give in and use WinAPI (you can get by with standard C++11 for some Unicode things, and even do this C output, with the magic words _setmode(_fileno(stdout), _O_U16TEXT);, which has been discussed enough times).
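For reference, a sketch of how those magic words might be wrapped so the same source still builds elsewhere. try_enable_utf16_stdout is a hypothetical name; _setmode, _fileno and _O_U16TEXT are the Microsoft runtime pieces mentioned above, and the non-Windows branch is just a stub.

```cpp
#include <cstdio>
#ifdef _WIN32
#include <fcntl.h>  // _O_U16TEXT
#include <io.h>     // _setmode
#endif

// Hypothetical helper: true only if stdout was switched to UTF-16 text mode.
bool try_enable_utf16_stdout() {
#ifdef _WIN32
    // _setmode returns the previous mode, or -1 on failure.
    return _setmode(_fileno(stdout), _O_U16TEXT) != -1;
#else
    return false;  // this mode only exists in Microsoft's runtime
#endif
}
```

If the call succeeds, my understanding is that you should stick to the wide functions (wprintf and friends) on that stream afterwards.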
With Visual C++, you need to do:
wprintf(L"3 %hs \n", narrowstr.c_str());
wprintf(L"4 %s \n", widestr.c_str());
Why? Because for printf, %s says narrow-char string, and for wprintf, %ls says wide. But for wprintf, %s implies wide as well (this is Microsoft-specific; standard C treats %s as narrow in both families), and %ls means wide explicitly. %hs means narrow (for both). For printf, %s in this scheme simply means %hs.
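The narrow-family half of this is the same everywhere, so it is easy to check. A small sketch, assuming ASCII-only content so the wide-to-narrow conversion cannot fail; format_narrow_family is a made-up name, and snprintf is used instead of printf so the result can be inspected.

```cpp
#include <cstdio>
#include <string>

// In the printf family, %s takes a char string and %ls takes a wchar_t string.
std::string format_narrow_family(const std::string& narrowstr,
                                 const std::wstring& widestr) {
    char buf[128];
    int n = std::snprintf(buf, sizeof buf, "1 %s 2 %ls",
                          narrowstr.c_str(), widestr.c_str());
    // Negative means a conversion error; n >= size means the output was truncated.
    if (n < 0 || n >= static_cast<int>(sizeof buf)) return std::string();
    return std::string(buf, static_cast<std::size_t>(n));
}
```

Called with "narrow" and L"wide", this should produce "1 narrow 2 wide".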
On VC++/Windows, %S (capital S) reverses the effect. Therefore printf("%S") means wide, and wprintf("%S") means narrow. This is useful for _tprintf, where %s then always matches the TCHAR width and %S the opposite one.
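If you'd rather not depend on how a given runtime reads %s or %S in the wide family, one option is to always spell out %ls for wide strings, which as far as I know means wchar_t in both families on both Microsoft's and standard-conforming runtimes. A sketch, with format_wide_family as a hypothetical name and swprintf standing in for wprintf so the result can be checked:

```cpp
#include <cwchar>
#include <string>

// %ls is the unambiguous way to say "wide string" in a wide format.
std::wstring format_wide_family(const std::wstring& widestr) {
    wchar_t buf[128];
    int n = std::swprintf(buf, sizeof buf / sizeof buf[0], L"4 %ls", widestr.c_str());
    if (n < 0) return std::wstring();  // error, or the buffer was too small
    return std::wstring(buf, static_cast<std::size_t>(n));
}
```

Called with L"wide", this should produce L"4 wide".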
The answers to 1 and 2 in the question are in the documentation. Any good set of documentation will do. They say cppreference is very good.
As for 3, the language standard does not specify any particular encoding for strings, or any particular size of wchar_t. You need to consult the documentation for your implementation, rather than for the language proper (though writing implementation-dependent code is rarely advisable).
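Besides reading the documentation, you can ask the implementation directly. A small sketch with made-up helper names; it tells you how wide wchar_t is, not which encoding your implementation puts into it.

```cpp
#include <climits>  // CHAR_BIT
#include <cwchar>   // WCHAR_MAX

// Width of wchar_t in bits on this implementation.
constexpr int wchar_bits() {
    return static_cast<int>(sizeof(wchar_t) * CHAR_BIT);
}

// True if one wchar_t is wide enough for the largest Unicode code point value.
// Whether it actually stores code points is still up to the implementation.
constexpr bool wchar_wide_enough_for_code_points() {
    return WCHAR_MAX >= 0x10FFFF;
}
```

Typical results are 16 bits on Windows and 32 bits on Linux, but that is exactly the sort of thing the standard leaves open.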