Printing UTF-8 in GLib

日久生厌 2021-02-14 19:11

Why can't UTF-8 characters be printed via GLib functions?

Source code:

#include "glib.h"
#include <stdio.h>

int main() {
    g_print("марко\n");
    return 0;
}

4 answers
  •  予麋鹿 (OP)
    2021-02-14 19:28

    The fprintf() family of functions assumes that every string you pass is already encoded to match the current encoding of your terminal, and writes the bytes out unchanged. g_print() does not assume that: it expects UTF-8 input and converts it to the encoding of the current locale whenever that encoding is not UTF-8. If the string already matched the terminal, that conversion is a bad idea, since it will most likely destroy the encoding. What is the locale setting of your terminal?

    You can either set the correct locale through environment variables on most systems, or you can do it programmatically using the setlocale function. Locale names are system dependent (not part of the POSIX standard), but on most systems the following will work:

    #include <locale.h>
    
    :
    
    setlocale(LC_ALL, "en_US.utf8");
    

    Instead of LC_ALL you can also set the locale for certain categories only (e.g. "en_US" causes English number and date formatting, which you may not want). To quote from the setlocale man page:

    LC_ALL Set the entire locale generically.

    LC_COLLATE Set a locale for string collation routines. This controls alphabetic ordering in strcoll() and strxfrm().

    LC_CTYPE Set a locale for the ctype(3) and multibyte(3) functions. This controls recognition of upper and lower case, alphabetic or non-alphabetic characters, and so on.

    LC_MESSAGES Set a locale for message catalogs, see catopen(3) function.

    LC_MONETARY Set a locale for formatting monetary values; this affects the localeconv() function.

    LC_NUMERIC Set a locale for formatting numbers. This controls the formatting of decimal points in input and output of floating point numbers in functions such as printf() and scanf(), as well as values returned by localeconv().

    LC_TIME Set a locale for formatting dates and times using the strftime() function.

    The only three locale values that are always available on all systems are "C", "POSIX" and "".

    Only three locales are defined by default: the empty string "" (which denotes the native environment) and the "C" and "POSIX" locales (which denote the C-language environment). A locale argument of NULL causes setlocale() to return the current locale. By default, C programs start in the "C" locale. The only function in the library that sets the locale is setlocale(); the locale is never changed as a side effect of some other routine.
