Hi, I was trying to output a Unicode string to a console with iostreams and failed.
I found this: "Using unicode font in c++ console app", and this snippet works.
I don't think there is an easy answer. Looking at Console Code Pages and the SetConsoleCP function, it seems that you will need to set up an appropriate codepage for the character set you're going to output.
Recently I wanted to stream Unicode from Python to the Windows console, and here is the minimum I needed to do:
chcp 65001
in the console, or use the corresponding method in the C++ code.
Look through an interesting article about Java Unicode on the Windows console.
Besides, in Python you cannot write to the default sys.stdout in this case; you will need to substitute it with something that uses os.write(1, binarystring) or a direct call to a wrapper around WriteConsoleW. It seems that in C++ you will need to do the same.
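For the C++ side, here is a minimal sketch of such a wrapper, not a definitive implementation. The names utf8_to_utf16 and write_utf8 are made up for this example. The decoder is hand-written so that it also compiles outside Windows; on Windows, MultiByteToWideChar is the usual way to do the conversion. When the output is not a real console (for example, redirected to a file), the sketch falls back to writing the raw UTF-8 bytes.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>
#ifdef _WIN32
#include <windows.h>
#endif

// Decodes UTF-8 into UTF-16 code units. Malformed or truncated sequences
// become U+FFFD. Overlong encodings are not rejected, to keep the sketch short.
std::u16string utf8_to_utf16(const std::string& in) {
    std::u16string out;
    std::size_t i = 0;
    while (i < in.size()) {
        const unsigned char lead = static_cast<unsigned char>(in[i]);
        std::uint32_t cp = 0xFFFD;  // stays U+FFFD for an invalid lead byte
        std::size_t len = 1;
        if (lead < 0x80) { cp = lead; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; len = 2; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; len = 3; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; len = 4; }

        // Every following byte must look like 10xxxxxx.
        bool ok = (i + len <= in.size());
        for (std::size_t k = 1; ok && k < len; ++k) {
            const unsigned char c = static_cast<unsigned char>(in[i + k]);
            if ((c & 0xC0) != 0x80) { ok = false; }
            else { cp = (cp << 6) | (c & 0x3F); }
        }
        if (!ok) { cp = 0xFFFD; len = 1; }
        if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) { cp = 0xFFFD; }

        if (cp >= 0x10000) {
            // Code points above the BMP are stored as a surrogate pair.
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        } else {
            out.push_back(static_cast<char16_t>(cp));
        }
        i += len;
    }
    return out;
}

// Writes UTF-8 text to standard output. On a real Windows console the text
// goes through WriteConsoleW; otherwise the raw bytes are written unchanged.
bool write_utf8(const std::string& text) {
#ifdef _WIN32
    HANDLE h = GetStdHandle(STD_OUTPUT_HANDLE);
    DWORD mode = 0;
    if (GetConsoleMode(h, &mode)) {  // succeeds only for console handles
        const std::u16string wide = utf8_to_utf16(text);
        DWORD written = 0;
        return WriteConsoleW(h, wide.data(), static_cast<DWORD>(wide.size()),
                             &written, nullptr) != 0;
    }
#endif
    return std::fwrite(text.data(), 1, text.size(), stdout) == text.size();
}
```

Calling write_utf8("Привет!\n") should then print correctly in a console window, provided the console font has the glyphs, and a redirected log file would receive plain UTF-8.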
There are a few issues with msvcrt and iostreams.
The Windows console supports Unicode through the ReadConsole and WriteConsole functions in UTF-16LE mode. The side effect is that redirection will not work in this case, i.e. myapp.exe >> ret.log produces a 0-byte ret.log file. If you are OK with that, you can try my library as follows.
const char* umessage = "Hello!\nПривет!\nПривіт!\nΧαιρετίσματα!\nHelló!\nHallå!\n";
...
#include <console.hpp>
#include <ios>
...
std::ostream& cout = io::console::out_stream();
cout << umessage
<< 1234567890ull << '\n'
<< 123456.78e+09 << '\n'
<< 12356.789e+10L << '\n'
<< std::hex << 0xCAFEBABE
<< std::endl;
The library auto-converts your UTF-8 to UTF-16LE and writes it to the console using WriteConsole. There are also error and input streams. Another benefit of the library is color support.
Link to the example app: https://github.com/incoder1/IO/tree/master/examples/iostreams
The library homepage: https://github.com/incoder1/IO
Use chcp to find which codepage works for you. In my case it was chcp 28591 for Western Europe.
To make it the default: REG ADD HKCU\Console /v CodePage /t REG_DWORD /d 28591
I had a similar problem, with Java. It is just cosmetic, since it involves log lines sent to the console; but it is still annoying.
The output from our Java application is supposed to be in UTF-8, and it displays correctly in Eclipse's console. But in the Windows console, it shows box-drawing characters: Inicializaci├│n and art├¡culos instead of Inicialización and artículos.
I stumbled upon a related question and mixed some of the answers to get to the solution that worked for me. The solution is to change the codepage used by the console and to use a font that supports Unicode (like Consolas or Lucida Console). You can select the font in the system menu of the Windows console:
Open a console: press Win + R, then type cmd and hit the Return key. Alternatively, press the Win key and type cmd followed by the Return key.
Open the system menu with the Alt + Space key combination and go to the console properties.
On the font tab, select Consolas or Lucida Console.
Click OK.
Regarding the codepage, for a one-off case you can get it done with the chcp command, and then you have to investigate which codepage is correct for your set of characters. Several answers suggested the UTF-8 codepage, which is 65001, but that codepage didn't work for my Spanish characters.
Another answer suggested a batch script for interactively selecting the codepage you want from a list. There I found the codepage for ISO-8859-1 that I needed: 28591. So you could execute
chcp 28591
before each execution of your application. You can check which codepage is right for you on the Code Page Identifiers MSDN page.
Yet another answer indicated how to persist the selected codepage as the default for your windows console. It involves changing the registry, so consider yourself warned that you might brick your machine by using this solution.
REG ADD HKCU\Console /v CodePage /t REG_DWORD /d 28591
This creates the CodePage value with the data 28591 inside the HKCU\Console registry key. And that did work for me.
Please note that HKCU ("HKEY_CURRENT_USER") applies only to the current user. If you want to change it for all users on that computer, you'll need to use the regedit utility and find or create the corresponding Console key (you'll probably have to create a Console key inside HKEY_USERS\.DEFAULT).
Here is a Hello World in Chinese. Actually it is just "Hello". I tested this on Windows 10, but I think it should work on Windows Vista and later. Before Windows Vista it will be hard if you want a programmatic solution instead of configuring the console, the registry, etc. Maybe have a look here if you really need to do this on Windows 7: Change console Font Windows 7
I don't want to claim this is the only solution, but this is what worked for me.
The approach below uses std::wcout for output.
I am using Visual Studio 2017 CE. I created a blank console app. The default settings are alright. But if you experience problems or use a different IDE, you might want to check these:
In your project properties, find Configuration Properties -> General -> Project Defaults -> Character Set. It should be "Use Unicode Character Set", not "Multi-Byte".
This will define the _UNICODE and UNICODE preprocessor macros for you.
int wmain(int argc, wchar_t* argv[])
Also, I think we should use the wmain function instead of main. They both work, but in a Unicode environment wmain may be more convenient.
Also my source files are UTF-16-LE encoded, which seems to be the default in Visual Studio 2017.
This is quite obvious: we need the Unicode codepage in the console.
If you want to check your default codepage, just open a console and type chcp without any arguments.
We have to change it to 65001, which is the UTF-8 codepage (see Windows Codepage Identifiers).
There is a preprocessor macro for that codepage: CP_UTF8.
I needed to set both the input and the output codepage. When I omitted either one, the output was incorrect.
SetConsoleOutputCP(CP_UTF8);
SetConsoleCP(CP_UTF8);
You might also want to check the boolean return values of those functions.
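As a rough sketch of what checking those return values could look like, the class below switches both codepages to UTF-8, remembers whether each call succeeded, and restores the previous codepages when it goes out of scope. The class name Utf8ConsoleScope and its accessors are my own invention for this example. Outside Windows the class does nothing and reports failure.

```cpp
#ifdef _WIN32
#include <windows.h>
#endif

// Switches the console to UTF-8 for the lifetime of the object and
// restores the previous codepages afterwards.
class Utf8ConsoleScope {
public:
    Utf8ConsoleScope() {
#ifdef _WIN32
        // Both getters return 0 on failure, e.g. when no console is attached.
        old_out_ = GetConsoleOutputCP();
        old_in_ = GetConsoleCP();
        out_ok_ = SetConsoleOutputCP(CP_UTF8) != 0;
        in_ok_ = SetConsoleCP(CP_UTF8) != 0;
#endif
    }

    ~Utf8ConsoleScope() {
#ifdef _WIN32
        // Only undo what was actually changed.
        if (out_ok_ && old_out_ != 0) { SetConsoleOutputCP(old_out_); }
        if (in_ok_ && old_in_ != 0) { SetConsoleCP(old_in_); }
#endif
    }

    Utf8ConsoleScope(const Utf8ConsoleScope&) = delete;
    Utf8ConsoleScope& operator=(const Utf8ConsoleScope&) = delete;

    bool output_ok() const { return out_ok_; }  // result of SetConsoleOutputCP
    bool input_ok() const { return in_ok_; }    // result of SetConsoleCP

private:
    unsigned old_out_ = 0;
    unsigned old_in_ = 0;
    bool out_ok_ = false;
    bool in_ok_ = false;
};
```

You would create one Utf8ConsoleScope at the top of main or wmain and, if output_ok() or input_ok() returns false, print a warning or fall back to plain ASCII output.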
So far I haven't found a console font that supports every character, so I had to choose one. If you want to output characters that are partly available only in one font and partly only in another, then I believe it is impossible to find a solution, unless there is a font out there that supports every character. I also didn't look into how to install a font.
I think it is not possible to use two different fonts in the same console window at the same time.
How do you find a compatible font? Open your console and go to the properties of the console window by clicking on the icon in the upper left of the window. Go to the fonts tab, choose a font and click OK. Then try to enter your characters in the console window. Repeat this until you find a font you can work with, then note down the name of the font.
You can also change the size of the font in the properties window. Once you have found a size you are happy with, note down the size values that are displayed in the properties window in the section "selected font". It shows the width and height in pixels.
To actually set the font programmatically you use:
CONSOLE_FONT_INFOEX fontInfo;
// ... configure fontInfo
SetCurrentConsoleFontEx(hConsole, false, &fontInfo);
See my example at the end of this answer for details. Or look it up in the fine manual: SetCurrentConsoleFontEx. This function is only available on Windows Vista and later.
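One detail worth a small sketch is filling in the face name, because FaceName is a fixed-size array and a long name must not overflow it. The helpers copy_face_name and set_console_font below are hypothetical names for this example. Starting from GetCurrentConsoleFontEx and overriding only the name and size is my own choice here, not something the steps above require.

```cpp
#include <algorithm>
#include <cstddef>
#include <cwchar>
#ifdef _WIN32
#include <windows.h>
#endif

// Copies a null-terminated face name into a fixed-size buffer.
// Returns false and leaves the buffer untouched if the name does not fit.
template <std::size_t N>
bool copy_face_name(wchar_t (&dest)[N], const wchar_t* name) {
    const std::size_t len = std::wcslen(name);
    if (len >= N) { return false; }          // no room for the terminator
    std::copy(name, name + len + 1, dest);   // +1 copies the terminator too
    return true;
}

#ifdef _WIN32
// Changes the face name and size of the current console font.
bool set_console_font(const wchar_t* face, short width, short height) {
    HANDLE console = GetStdHandle(STD_OUTPUT_HANDLE);
    CONSOLE_FONT_INFOEX info = {};
    info.cbSize = sizeof(info);              // must be set before the call
    if (!GetCurrentConsoleFontEx(console, FALSE, &info)) { return false; }
    if (!copy_face_name(info.FaceName, face)) { return false; }
    info.dwFontSize.X = width;
    info.dwFontSize.Y = height;
    return SetCurrentConsoleFontEx(console, FALSE, &info) != 0;
}
#endif
```

With this, set_console_font(L"KaiTi", 18, 41) would attempt the same font change as the full example, and the return value tells you whether the console accepted it.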
You will need to set the locale to that of the language whose characters you want to print.
char* a = setlocale(LC_ALL, "chinese");
The return value is interesting. It contains a string that describes exactly which locale was chosen.
Just give it a try :-)
I tested with chinese and german.
More info: setlocale
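Since setlocale returns a null pointer when it does not accept a name, it is worth checking the result. Here is a small sketch that tries several names in order and reports which one was chosen. The helper name set_first_available_locale is made up for this example, and which locale names exist depends on the platform.

```cpp
#include <clocale>
#include <initializer_list>
#include <string>

// Tries each candidate locale name in order. Returns the string that
// setlocale reports for the first accepted name, or an empty string
// if none of the candidates is accepted.
std::string set_first_available_locale(std::initializer_list<const char*> candidates) {
    for (const char* name : candidates) {
        if (const char* chosen = std::setlocale(LC_ALL, name)) {
            return chosen;
        }
    }
    return std::string();
}
```

For example, set_first_available_locale({"chinese", "C"}) would select "chinese" where the runtime knows that name and fall back to the "C" locale elsewhere.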
Not much to say here. If you want to output wide characters, use this, for example:
std::wcout << L"你好" << std::endl;
Oh, and don't forget the L prefix for wide characters!
And if you type literal Unicode characters like this in the source file, the source file must be Unicode encoded, like the UTF-16-LE default in Visual Studio. Or maybe use Notepad++ and set the encoding to UCS-2 LE BOM.
Finally I put it all together as an example:
#include <Windows.h>
#include <algorithm>
#include <iostream>
#include <io.h>
#include <fcntl.h>
#include <locale.h>
#include <wincon.h>

int wmain(int argc, wchar_t* argv[])
{
    SetConsoleTitle(L"My Console Window - 你好");
    HANDLE hConsole = GetStdHandle(STD_OUTPUT_HANDLE);

    // Select the locale of the language to print.
    // The return value names the locale that was actually chosen.
    char* a = setlocale(LC_ALL, "chinese");

    // Switch both the output and the input codepage to UTF-8.
    SetConsoleOutputCP(CP_UTF8);
    SetConsoleCP(CP_UTF8);

    // Pick a console font that contains the required glyphs.
    CONSOLE_FONT_INFOEX fontInfo = {};
    fontInfo.cbSize = sizeof(fontInfo);
    fontInfo.FontFamily = 54;
    fontInfo.FontWeight = 400;
    fontInfo.nFont = 0;
    const wchar_t myFont[] = L"KaiTi";
    fontInfo.dwFontSize = { 18, 41 };
    // Copies the face name including its terminating null character.
    std::copy(myFont, myFont + (sizeof(myFont) / sizeof(wchar_t)), fontInfo.FaceName);
    SetCurrentConsoleFontEx(hConsole, false, &fontInfo);

    std::wcout << L"Hello World!" << std::endl;
    std::wcout << L"你好!" << std::endl;
    return 0;
}
Cheers !
I had a similar problem. "Output Unicode to console Using C++, in Windows" contains the gem that you need to run chcp 65001 in the console before running your program.
There may be some way of doing this programmatically, but I don't know what it is.