Question
Are there any plans to add versions of the C standard library string-processing functions that are invariant under the current locale?
Currently there are lots of fragile workarounds, for example, from jansson/strconv.c:
static void to_locale(strbuffer_t *strbuffer)
{
    const char *point;
    char *pos;

    point = localeconv()->decimal_point;
    if(*point == '.') {
        /* No conversion needed */
        return;
    }

    pos = strchr(strbuffer->value, '.');
    if(pos)
        *pos = *point;
}

static void from_locale(char *buffer)
{
    const char *point;
    char *pos;

    point = localeconv()->decimal_point;
    if(*point == '.') {
        /* No conversion needed */
        return;
    }

    pos = strchr(buffer, *point);
    if(pos)
        *pos = '.';
}
These functions preprocess their input so that it can be handled independently of the current locale, under the following assumptions:
- The decimal delimiter is one byte.
- No call to setlocale happens between these fix-up functions and the call to any of the affected functions.
- The string can be modified before conversion.
(1) implies that the preprocessing approach breaks on exotic locales (see https://en.wikipedia.org/wiki/Decimal_mark#Hindu.E2.80.93Arabic_numeral_system for examples). (2) implies that the preprocessing approach cannot be thread-safe without a lock, and that lock would have to be added to the C library. (3) is simply an unreasonable requirement.
If it were only possible to specify the locale for a single call to a string-processing function as a parameter, not affecting any other threads, none of these restrictions would apply.
Questions:
- Are there any reports to WG14 or WG21 that address this defect?
- If so, why haven't these been merged into the standard? It would be nothing more than a new set of functions that take a locale as an argument.
- What is the canonical workaround?
Update:
After searching the Internet, I found the *_l functions, available on FreeBSD, GNU/Linux and Mac OS X. Similar functions exist on Windows as well. These solve my problem; however, they are not in POSIX, which is a superset of C (not quite: POSIX relaxes some rules on pointers). So questions 1 and 2 remain open.
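Where the *_l variants are unavailable, one option that stays within POSIX.1-2008 is to switch the calling thread's locale with uselocale() around a single conversion. The following is only a minimal sketch of that idea, not part of the original question: the helper name strtod_c is hypothetical, and the feature-test macro and header guards are assumptions about glibc-like and macOS/FreeBSD systems.

```c
#if !defined(__APPLE__) && !defined(__FreeBSD__) && !defined(_POSIX_C_SOURCE)
#define _POSIX_C_SOURCE 200809L  /* assumption: exposes locale_t, newlocale, uselocale */
#endif
#include <assert.h>
#include <locale.h>
#include <stdlib.h>
#if defined(__APPLE__) || defined(__FreeBSD__)
#include <xlocale.h>  /* assumption: these systems declare the locale_t API here */
#endif

/* Hypothetical helper: parse a double with '.' as the decimal point,
 * whatever the global or thread locale currently is.
 * Returns 0 on success, -1 if the "C" locale object cannot be created. */
int strtod_c(const char *s, char **end, double *out)
{
    assert(s != NULL && out != NULL);

    locale_t c_loc = newlocale(LC_ALL_MASK, "C", (locale_t)0);
    if (c_loc == (locale_t)0)
        return -1;

    locale_t old = uselocale(c_loc);  /* changes the calling thread only */
    *out = strtod(s, end);            /* input string is not modified */
    uselocale(old);                   /* restore the previous thread locale */
    freelocale(c_loc);
    return 0;
}
```

Creating and freeing the locale object on every call is wasteful; a real program would probably create it once and reuse it. The point of the sketch is only that no other thread is affected and the input is left untouched.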
Answer 1:
BSD and macOS Sierra (and Mac OS X before it) support _l functions that allow you to specify the locale, rather than relying on the current locale. For example:
int fprintf_l(FILE * restrict stream, locale_t loc, const char * restrict format, ...);
int printf_l(locale_t loc, const char * restrict format, ...);
int snprintf_l(char * restrict str, size_t size, locale_t loc, const char * restrict format, ...);
int sprintf_l(char * restrict str, locale_t loc, const char * restrict format, ...);
and:
int fscanf_l(FILE * restrict stream, locale_t loc, const char * restrict format, ...);
int scanf_l(locale_t loc, const char * restrict format, ...);
int sscanf_l(const char * restrict str, locale_t loc, const char * restrict format, ...);
As a general design, this seems sensible. The type locale_t is not part of Standard C but is part of POSIX (and defined in <locale.h> there), and is used in <ctype.h> amongst other places. The BSD man pages say that the header to use is <xlocale.h> rather than <locale.h>; this would perhaps be fixed by the standard. Unless there is a major flaw in the design of the BSD functions, these should be a very good basis for any standardization effort, whether that was under POSIX or Standard C.
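To illustrate how such a function might be used, here is a hedged sketch that formats a double with a '.' decimal point. The helper name format_double_c is hypothetical. On macOS and FreeBSD it calls snprintf_l with the signature quoted above; elsewhere it falls back to the POSIX uselocale() call, since snprintf_l is not assumed to exist there. The header and feature-macro guards are assumptions, not something taken from the man pages.

```c
#if !defined(__APPLE__) && !defined(__FreeBSD__) && !defined(_POSIX_C_SOURCE)
#define _POSIX_C_SOURCE 200809L  /* assumption: exposes locale_t and uselocale */
#endif
#include <assert.h>
#include <locale.h>
#include <stdio.h>
#if defined(__APPLE__) || defined(__FreeBSD__)
#include <xlocale.h>  /* BSD/macOS: snprintf_l; included after <stdio.h> */
#endif

/* Hypothetical helper: write `value` into `buf` using the "C" locale.
 * Returns the snprintf-style length, or -1 if the locale object
 * cannot be created. */
int format_double_c(char *buf, size_t size, double value)
{
    assert(buf != NULL || size == 0);

    locale_t c_loc = newlocale(LC_ALL_MASK, "C", (locale_t)0);
    if (c_loc == (locale_t)0)
        return -1;

#if defined(__APPLE__) || defined(__FreeBSD__)
    /* Locale passed explicitly; no thread or global state is touched. */
    int n = snprintf_l(buf, size, c_loc, "%.17g", value);
#else
    /* Fallback: switch the calling thread's locale for this one call. */
    locale_t old = uselocale(c_loc);
    int n = snprintf(buf, size, "%.17g", value);
    uselocale(old);
#endif

    freelocale(c_loc);
    return n;
}
```

The explicit-locale branch is the shape the answer argues for: the locale is just another argument, so nothing needs to be saved and restored around the call.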
One issue with the BSD design might be that the locale_t structure is passed by value, not by (constant restricted) pointer, which is a little surprising. However, it is consistent with the POSIX functions such as:
int isalpha_l(int, locale_t);
A similar scheme might be devised for handling time zone settings, too. There'd be more work in setting that up since there isn't already a time zone type (whereas locale_t is part of POSIX already, and could probably be adopted without change into Standard C). But, combined with locale settings, it could make the time routines more easily usable in diverse environments from a single executable.
Answer 2:
SQLite has a locale-independent printf implementation, which is good for this sort of thing because it keeps doubles compatible with SQL syntax rules. If you can include SQLite as a dependency, that might be a viable option.
Source: https://stackoverflow.com/questions/41794607/locale-invariant-string-processing-with-strtod-strtof-atof-printf