Question
I am wondering how the toupper() function in C works. I am trying it out in the code below but I'm definitely doing something wrong. The code compiles, but the arguments passed into toupper() are not being capitalized...
char **copyArgs(int argc, char **argv) {
    char **a = malloc(sizeof(char *) * (argc));
    int i;
    for(i = 0; i < argc; i++) {
        int size = strlen(argv[i]);
        a[i] = malloc(sizeof(char) * (size + 1));
        strcpy(a[i], argv[i]);
        a[i] = toupper(a[i]);
    }
    return a;
}
If I test this with "one two" it results in "one two", not "ONE TWO". Any advice is appreciated.
Answer 1:
toupper converts a single character to uppercase. In your code you are passing it a pointer (a char *) instead of a char; C's lenient implicit conversions let this compile, but it cannot work correctly. You are probably getting a warning along the lines of "implicit pointer to integer conversion without a cast": that is a strong sign that you are doing something very wrong.
The whole thing doesn't blow up only because, on your platform, int is as big as a pointer (or at least big enough for the pointers you are using); toupper tries to interpret that int as a character, finds that it's non-alphabetic, and returns it unmodified. That's sheer luck. On other platforms your program would probably crash, both because the pointer-to-int conversion can truncate the pointer and because the behavior of toupper on integers outside the unsigned char range (plus EOF) is undefined.
To convert a whole string to uppercase, you have to iterate over all its chars and call toupper on each of them. You can easily write a function that does this:
void strtoupper(char *str)
{
    /* toupper only returns the converted value, so store it back. */
    for (; *str != '\0'; str++)
        *str = (char)toupper((unsigned char)*str);
}
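Applied to the question's copyArgs, the per-character approach might look like the sketch below. The helper name upper_in_place and the allocation checks are assumptions added for illustration; neither appears in the original post.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: uppercase a NUL-terminated string in place. */
static void upper_in_place(char *s)
{
    for (; *s != '\0'; s++)
        *s = (char)toupper((unsigned char)*s);
}

/* Copy argv and uppercase each copy. Assumes argc > 0.
   Returns NULL if an allocation fails (this error handling is an
   assumption; the question's code does not check malloc). */
char **copyArgs(int argc, char **argv)
{
    char **a = malloc(sizeof *a * (size_t)argc);
    if (a == NULL)
        return NULL;

    for (int i = 0; i < argc; i++) {
        a[i] = malloc(strlen(argv[i]) + 1);
        if (a[i] == NULL) {
            /* Release the copies made so far before giving up. */
            while (i-- > 0)
                free(a[i]);
            free(a);
            return NULL;
        }
        strcpy(a[i], argv[i]);
        upper_in_place(a[i]);   /* convert the copy, not the pointer */
    }
    return a;
}
```

The key difference from the question's code is that the pointer a[i] is never reassigned; only the characters it points to are changed.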
Notice the unsigned char cast: the C functions dealing with character classification and conversion require an int that is either EOF (which is left intact) or the value of an unsigned char. In short, plain char may be signed, so a character with the high bit set would otherwise be passed as a negative int. The full reason is sad and complex, and I already detailed it in another answer.
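To make the role of the cast concrete, here is a small sketch of a single-character wrapper. The name upper_char is hypothetical, chosen only for this example.

```c
#include <ctype.h>

/* Hypothetical wrapper: uppercase one plain char safely. */
char upper_char(char c)
{
    /* Plain char may be signed, so a byte such as 0xE9 can hold a
       negative value. Casting to unsigned char first maps it into
       0..UCHAR_MAX, which is the range toupper is defined for. */
    return (char)toupper((unsigned char)c);
}
```

Without the cast, a negative char value would be passed straight to toupper as a negative int, which is exactly the undefined case described above.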
Still, it's worth noting that toupper by design cannot work reliably with multibyte character encodings (such as UTF-8), so it has no real place in modern text processing (like most of the C locale facilities, which were (badly) designed in another era).
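The multibyte limitation can be seen with a byte-wise loop. This is only an illustrative sketch: the helper name upper_bytes is hypothetical, and the behavior described assumes the default "C" locale, where only the letters a to z have uppercase mappings.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: apply toupper to every byte of a string.
   It works one byte at a time, so it cannot see that two bytes
   together encode a single UTF-8 character. */
void upper_bytes(char *s)
{
    for (size_t i = 0; s[i] != '\0'; i++)
        s[i] = (char)toupper((unsigned char)s[i]);
}
```

Given the UTF-8 bytes for "héllo" (where é is the two bytes 0xC3 0xA9), this leaves the two bytes of é untouched in the "C" locale and uppercases only the ASCII letters, so é never becomes É. Under other single-byte locales the individual bytes could even be altered separately, corrupting the character.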
Source: https://stackoverflow.com/questions/15057899/toupper-function