How to display accented characters in the console window in a way that works with all compilers?

Question

I wrote a program that prints the names of bus stations on the screen while it runs, and these names often contain accented characters. I came up with a solution that works fine for me and for my friend on Visual Studio '13 and '15, but my teacher, who grades the program, replied that it doesn't work for him. The file NULL is in the folder that contains the code. I used this:

setlocale(LC_ALL, "");

system("chcp 1250 > NULL");

printf("Mária Terézia körút\n");

My question: how can I make the program display accented characters on every compiler and OS?

Answer 1:

Unfortunately, nothing works on all compilers, but with a few #if blocks we can get pretty close. It has been reported that code like this fails on tdm-gcc.

The standard way to do this in C is to print a wide-character string with the standard library. Unfortunately, that doesn’t work with the MSVC runtime without a bit of extra initialization. If you do this, you cannot switch back and forth between the wide-character functions such as wprintf() and the narrow functions such as printf() in the same program.

#include <locale.h>
#include <stdlib.h>
#include <stdio.h>
#include <wchar.h>

/* This has been reported not to autodetect correctly on tdm-gcc. */
#ifndef MS_STDLIB_BUGS // Allow overriding the autodetection.
#  if ( _WIN32 || _WIN64 )
#    define MS_STDLIB_BUGS 1
#  else
#    define MS_STDLIB_BUGS 0
#  endif
#endif

#if MS_STDLIB_BUGS
#  include <io.h>
#  include <fcntl.h>
#endif

void init_locale(void)
// Does magic so that wprintf() can work.
{
  // Constant for fwide().
  static const int wide_oriented = 1;

#if MS_STDLIB_BUGS
  // Windows needs a little non-standard magic.
  static const char locale_name[] = ".1200";
  _setmode( _fileno(stdout), _O_WTEXT );
#else
  // The correct locale name may vary by OS, e.g., "en_US.utf8".
  static const char locale_name[] = "";
#endif

  setlocale( LC_ALL, locale_name );
  fwide( stdout, wide_oriented );
}

int main(void)
{
  init_locale();
  wprintf(L"Mária Terézia körút\n");
  return EXIT_SUCCESS;
}

For compatibility with all recent compilers, you have to save the source file as UTF-8 with a BOM. (MSVC versions prior to VS 2017 cannot read UTF-8 without the BOM, and clang cannot read anything but UTF-8.)

In order to read the text on the console, you must set the font to a monospaced Unicode font, such as Lucida Console.

On Linux, make sure your locale environment variables are set correctly and that they match the settings of your terminal.
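One way to check this from inside the program is sketched below, assuming a POSIX system that provides nl_langinfo(). It adopts the locale named by the environment and asks which character encoding that locale uses. The function names codeset_is_utf8 and environment_locale_is_utf8 are invented for this illustration, and the list of accepted spellings is a guess at the common ones, not an exhaustive list.

```c
#include <langinfo.h>
#include <locale.h>
#include <string.h>

/* Hypothetical helper: does a codeset name mean UTF-8?
   Systems spell it differently, so a few common variants are accepted. */
int codeset_is_utf8(const char *name)
{
    return strcmp(name, "UTF-8") == 0 || strcmp(name, "utf8") == 0
        || strcmp(name, "utf-8") == 0 || strcmp(name, "UTF8") == 0;
}

/* Adopts the locale from the environment (LANG, LC_ALL, ...) and reports
   whether its character encoding is UTF-8. POSIX only. */
int environment_locale_is_utf8(void)
{
    if (setlocale(LC_ALL, "") == NULL)
        return 0;   /* the environment names a locale that is not installed */
    return codeset_is_utf8(nl_langinfo(CODESET));
}
```

If environment_locale_is_utf8() returns 0 on a terminal that is displaying UTF-8, the locale variables and the terminal disagree, which is the mismatch described above.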

An alternative is to set the console to UTF-8 and then call printf(u8"Mária Terézia körút\n");. On Windows, the command for this is chcp 65001. On Linux, it’s export LANG=en_US.utf8 or the appropriate equivalent from locale -a, and it is probably set up by default. Be warned: UTF-8 is a second-class citizen on Windows.
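A rough sketch of this UTF-8 route, assuming the terminal and its font can render UTF-8: the string is written as explicit UTF-8 bytes so that the result does not depend on how the source file is saved, and on Windows the output code page is switched with SetConsoleOutputCP(CP_UTF8), the programmatic counterpart of chcp 65001 for output. The names station_utf8, use_utf8_console and print_station are made up for this illustration.

```c
#include <stdio.h>

#if defined(_WIN32)
#  include <windows.h>
#endif

/* "Mária Terézia körút" spelled out as UTF-8 bytes. The literal is split
   after each escape so that no hex escape swallows a following letter. */
static const char station_utf8[] =
    "M\xC3\xA1" "ria Ter\xC3\xA9" "zia k\xC3\xB6" "r\xC3\xBA" "t";

/* Hypothetical helper: switch the Windows console to UTF-8 output.
   Elsewhere it does nothing, on the assumption that the terminal is
   already UTF-8. Returns nonzero on success. */
int use_utf8_console(void)
{
#if defined(_WIN32)
    return SetConsoleOutputCP(CP_UTF8) != 0;
#else
    return 1;
#endif
}

/* Prints the station name followed by a newline and returns printf()'s
   result, i.e. the number of bytes written. */
int print_station(void)
{
    return printf("%s\n", station_utf8);
}
```

Calling use_utf8_console() once at the start of main() and print_station() afterwards should be enough on a UTF-8 terminal. Whether the Windows console then shows the accents correctly still depends on the console font.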

Answer 2:

For compatibility with the web (W3C, IETF) and *nix systems (Linux, BSD, Android, OS X, GCC, Clang), use UTF-8 without a BOM (as it was designed).

For compatibility with legacy proprietary systems that are still struggling with their Unicode migration, use UTF-8 with a BOM. Or admit they're not ready for Unicode and use a legacy proprietary 8-bit encoding (you will need to migrate to UTF-8 someday, so that just adds code needing fixing to the pile).

Hint: with a BOM, UTF-8 is not ASCII-compatible, which kills any easy migration from 8-bit encodings to Unicode.
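To make the compatibility point concrete: the UTF-8 BOM is the three bytes EF BB BF at the very start of a file, and none of them is an ASCII character, so a tool that expects plain ASCII trips over them. A minimal sketch of detecting the mark in a buffer follows. The function name utf8_bom_length is hypothetical and not part of any standard library.

```c
#include <stddef.h>
#include <string.h>

/* Returns how many bytes to skip at the start of buf: 3 if it begins with
   the UTF-8 byte order mark EF BB BF, otherwise 0. */
size_t utf8_bom_length(const char *buf, size_t len)
{
    static const char bom[3] = { '\xEF', '\xBB', '\xBF' };

    if (len >= sizeof bom && memcmp(buf, bom, sizeof bom) == 0)
        return sizeof bom;
    return 0;
}
```

A reader that wants to accept both variants could advance its pointer by utf8_bom_length(buf, len) before processing the text.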

Answer 3:

I managed to work out a simple solution that seems to do the trick. With it, the accented characters of the Hungarian language display correctly on both Win7 and Win10, with VS '13, VS '15 and Code::Blocks.

#include <locale.h>
#if defined(WIN32) || defined(_WIN32)
#include <windows.h>
#endif

In main:

setlocale(LC_ALL, "");
#if defined(WIN32) || defined(_WIN32)
SetConsoleCP(1250); SetConsoleOutputCP(1250);
#endif
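One caveat with this approach is that the bytes of the string literal must really be Windows-1250 for the console to decode them correctly, which depends on how the source file is saved and how the compiler reads it. A sketch that sidesteps this by writing the four accented letters as explicit Windows-1250 bytes is given below. The names station_cp1250 and use_cp1250_console are invented for this illustration, and the approach is only meaningful on a Windows console.

```c
#include <string.h>

#if defined(WIN32) || defined(_WIN32)
#  include <windows.h>
#endif

/* "Mária Terézia körút" as Windows-1250 bytes:
   á = 0xE1, é = 0xE9, ö = 0xF6, ú = 0xFA.
   The literal is split after each escape so that no hex escape swallows a
   following letter. */
static const char station_cp1250[] =
    "M\xE1" "ria Ter\xE9" "zia k\xF6" "r\xFA" "t";

/* Hypothetical helper: set the console input and output code pages to 1250.
   Returns nonzero if both calls succeed, and 0 on non-Windows systems,
   where the code page was not changed. */
int use_cp1250_console(void)
{
#if defined(WIN32) || defined(_WIN32)
    return SetConsoleCP(1250) != 0 && SetConsoleOutputCP(1250) != 0;
#else
    return 0;
#endif
}

/* Length in bytes of the station name; one byte per character in this
   single-byte encoding. */
size_t station_cp1250_length(void)
{
    return strlen(station_cp1250);
}
```

After a successful use_cp1250_console(), printf("%s\n", station_cp1250) should print the name regardless of the encoding the source file was saved in.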

Source: https://stackoverflow.com/questions/44224081/how-to-display-accented-chars-on-the-console-window-that-work-for-all-compiler

Tags

display

non-ascii-characters