Question
Given this C++11 program, should I expect to see a number or a letter? Or not make expectations?
#include <cstdint>
#include <iostream>

int main()
{
    int8_t i = 65;
    std::cout << i;
}
Does the standard specify whether this type can or will be a character type?
Answer 1:
From § 18.4.1 [cstdint.syn] of the C++0x FDIS (N3290), int8_t is an optional typedef that is specified as follows:
namespace std {
typedef signed integer type int8_t; // optional
//...
} // namespace std
§ 3.9.1 [basic.fundamental] states:
There are five standard signed integer types: “signed char”, “short int”, “int”, “long int”, and “long long int”. In this list, each type provides at least as much storage as those preceding it in the list. There may also be implementation-defined extended signed integer types. The standard and extended signed integer types are collectively called signed integer types....
Types bool, char, char16_t, char32_t, wchar_t, and the signed and unsigned integer types are collectively called integral types. A synonym for integral type is integer type.
§ 3.9.1 also states:
In any particular implementation, a plain char object can take on either the same values as a signed char or an unsigned char; which one is implementation-defined.
It is tempting to conclude that int8_t may be a typedef of char provided char objects take on signed values; however, this is not the case, as char is not among the signed integer types (standard and possibly extended signed integer types). See also Stephan T. Lavavej's comments on std::make_unsigned and std::make_signed.
Therefore, either int8_t is a typedef of signed char or it is an extended signed integer type whose objects occupy exactly 8 bits of storage.
To answer your question, though, you should not make assumptions. Because functions of both forms x.operator<<(y) and operator<<(x,y) have been defined, § 13.5.3 [over.binary] says that we refer to § 13.3.1.2 [over.match.oper] to determine the interpretation of std::cout << i. § 13.3.1.2 in turn says that the implementation selects from the set of candidate functions according to § 13.3.2 and § 13.3.3. We then look to § 13.3.3.2 [over.ics.rank] to determine that:
- The template<class traits> basic_ostream<char,traits>& operator<<(basic_ostream<char,traits>&, signed char) template would be called if int8_t is an Exact Match for signed char (i.e. a typedef of signed char).
- Otherwise, the int8_t would be promoted to int and the basic_ostream<charT,traits>& operator<<(int n) member function would be called.
In the case of std::cout << u for u a uint8_t object:
- The template<class traits> basic_ostream<char,traits>& operator<<(basic_ostream<char,traits>&, unsigned char) template would be called if uint8_t is an Exact Match for unsigned char.
- Otherwise, since int can represent all uint8_t values, the uint8_t would be promoted to int and the basic_ostream<charT,traits>& operator<<(int n) member function would be called.
If you always want to print a character, the safest and clearest option is:
std::cout << static_cast<signed char>(i);
And if you always want to print a number:
std::cout << static_cast<int>(i);
Answer 2:
int8_t is exactly 8 bits wide (if it exists).
The only predefined integer types that can be 8 bits are char, unsigned char, and signed char. Both short and unsigned short are required to be at least 16 bits.
So int8_t must be a typedef for either signed char or plain char (the latter if plain char is signed).
If you want to print an int8_t value as an integer rather than as a character, you can explicitly convert it to int.
In principle, a C++ compiler could define an 8-bit extended integer type (perhaps called something like __int8), and make int8_t a typedef for it. The only reason I can think of to do so would be to avoid making int8_t a character type. I don't know of any C++ compilers that have actually done this.
Both int8_t and extended integer types were introduced in C99. For C, there's no particular reason to define an 8-bit extended integer type when the char types are available.
UPDATE:
I'm not entirely comfortable with this conclusion. int8_t and uint8_t were introduced in C99. In C, it doesn't particularly matter whether they're character types or not; there are no operations for which the distinction makes a real difference. (Even putc(), the lowest-level character output routine in standard C, takes the character to be printed as an int argument.) int8_t and uint8_t, if they're defined, will almost certainly be defined as character types -- but character types are just small integer types.
C++ provides specific overloaded versions of operator<< for char, signed char, and unsigned char, so that std::cout << 'A' and std::cout << 65 produce very different output. Later, C++ adopted int8_t and uint8_t, but in such a way that, as in C, they're almost certainly character types. For most operations, this doesn't matter any more than it does in C, but for std::cout << ... it does make a difference, since this:
uint8_t x = 65;
std::cout << x;
will probably print the letter A rather than the number 65.
If you want consistent behavior, add a cast:
uint8_t x = 65;
std::cout << int(x); // or static_cast<int>(x) if you prefer
I think the root of the problem is that there's something missing from the language: very narrow integer types that are not character types.
As for the intent, I could speculate that the committee members either didn't think about the issue, or decided it wasn't worth addressing. One could argue (and I would) that the benefits of adding the [u]int*_t types to the standard outweigh the inconvenience of their rather odd behavior with std::cout << ....
Answer 3:
I'll answer your questions in reverse order.
Does the standard specify whether this type can or will be a character type?
Short answer: int8_t is signed char on the most popular platforms (GCC/Intel/Clang on Linux and Visual Studio on Windows) but might be something else on others.
The long answer follows.
Section 18.4.1 of the C++11 Standard provides the synopsis of <cstdint>, which includes the following:
typedef signed integer type int8_t; // optional
Later in the same section, paragraph 2, it says
The header [<cstdint>] defines all functions, types, and macros the same as 7.18 in the C standard.
where C standard means C99 as per 1.1/2:
C++ is a general purpose programming language based on the C programming language as described in ISO/IEC 9899:1999 Programming languages — C (hereinafter referred to as the C standard).
Hence, the definition of int8_t is to be found in Section 7.18 of the C99 standard. More precisely, C99's Section 7.18.1.1 says
The typedef name intN_t designates a signed integer type with width N, no padding bits, and a two’s complement representation. Thus, int8_t denotes a signed integer type with a width of exactly 8 bits.
In addition, C99's Section 6.2.5/4 says
There are five standard signed integer types, designated as signed char, short int, int, long int, and long long int. (These and other types may be designated in several additional ways, as described in 6.7.2.) There may also be implementation-defined extended signed integer types. The standard and extended signed integer types are collectively called signed integer types.
Finally, C99's Section 5.2.4.2.1 imposes minimum sizes for standard signed integer types. Excluding signed char, all others are at least 16 bits long.
Therefore, int8_t is either signed char or an 8-bit extended (non-standard) signed integer type.
Both glibc (the GNU C library) and the Visual Studio C library define int8_t as signed char. Intel and Clang, at least on Linux, also use the same libc and hence the same applies to them. Therefore, on the most popular platforms int8_t is signed char.
Given this C++11 program, should I expect to see a number or a letter? Or not make expectations?
Short answer: On the most popular platforms (GCC/Intel/Clang on Linux and Visual Studio on Windows) you will certainly see the letter 'A'. On other platforms you might see 65, though. (Thanks to DyP for pointing this out to me.)
In what follows, all references are to the C++11 standard (current draft, N3485).
Section 27.4.1 provides the synopsis of <iostream>; in particular, it states the declaration of cout:
extern ostream cout;
Now, ostream is a typedef for a template specialization of basic_ostream as per Section 27.7.1:
template <class charT, class traits = char_traits<charT> >
class basic_ostream;
typedef basic_ostream<char> ostream;
Section 27.7.3.6.4 provides the following declaration:
template<class traits>
basic_ostream<char,traits>& operator<<(basic_ostream<char,traits>& out, signed char c);
If int8_t is signed char then it's this overload that's going to be called. The same section also specifies that the effect of this call is printing the character (not the number).
Now, let's consider the case where int8_t is an extended signed integer type. Obviously, the standard doesn't specify overloads of operator<<() for non-standard types, but thanks to promotions and conversions one of the provided overloads might accept the call. Indeed, int is at least 16 bits long and can represent all the values of int8_t. Then 4.5/1 gives that int8_t can be promoted to int. On the other hand, 4.7/1 and 4.7/2 give that int8_t can be converted to signed char. Finally, 13.3.3.1.1 yields that promotion is favored over conversion during overload resolution. Therefore, the following overload (declared in 27.7.3.1)
basic_ostream& basic_ostream::operator<<(int n);
will be called. This means that this code
int8_t i = 65;
std::cout << i;
will print 65.
Update:
1. Corrected the post following DyP's comment.
2. Added the following comments on the possibility of int8_t being a typedef for char.
As said, the C99 standard (Section 6.2.5/4 quoted above) defines 5 standard signed integer types (char is not one of them) and allows implementations to add their own, which are referred to as extended (non-standard) signed integer types. The C++ standard reinforces that definition in Section 3.9.1/2:
There are five standard signed integer types : “signed char”, “short int”, “int”, “long int”, and “long long int” [...] There may also be implementation-defined extended signed integer types. The standard and extended signed integer types are collectively called signed integer types.
Later, in the same section, paragraph 7 says:
Types bool, char, char16_t, char32_t, wchar_t, and the signed and unsigned integer types are collectively called integral types. A synonym for integral type is integer type.
Therefore, char is an integer type but char is neither a signed integer type nor an unsigned integer type, and Section 18.4.1 (quoted above) says that int8_t, when present, is a typedef for a signed integer type.
What might be confusing is that, depending on the implementation, char can take the same values as a signed char. In particular, char might have a sign but it's still not a signed char. This is explicitly said in Section 3.9.1/1:
[...] Plain char, signed char, and unsigned char are three distinct types. [...] In any particular implementation, a plain char object can take on either the same values as a signed char or an unsigned char; which one is implementation-defined.
This also implies that char is not a signed integer type as defined by 3.9.1/2.
3. I admit that my interpretation and, specifically, the sentence "char is neither a signed integer type nor an unsigned integer type" is a bit controversial.
To strengthen my case, I would like to add that Stephan T. Lavavej said the very same thing here and Johannes Schaub - litb also used the same sentence in a comment on this post.
Answer 4:
The working draft copy I have, N3376, specifies in [cstdint.syn] § 18.4.1 that the int types are typically typedefs.
namespace std {
typedef signed integer type int8_t; // optional
typedef signed integer type int16_t; // optional
typedef signed integer type int32_t; // optional
typedef signed integer type int64_t; // optional
typedef signed integer type int_fast8_t;
typedef signed integer type int_fast16_t;
typedef signed integer type int_fast32_t;
typedef signed integer type int_fast64_t;
typedef signed integer type int_least8_t;
typedef signed integer type int_least16_t;
typedef signed integer type int_least32_t;
typedef signed integer type int_least64_t;
typedef signed integer type intmax_t;
typedef signed integer type intptr_t; // optional
typedef unsigned integer type uint8_t; // optional
typedef unsigned integer type uint16_t; // optional
typedef unsigned integer type uint32_t; // optional
typedef unsigned integer type uint64_t; // optional
typedef unsigned integer type uint_fast8_t;
typedef unsigned integer type uint_fast16_t;
typedef unsigned integer type uint_fast32_t;
typedef unsigned integer type uint_fast64_t;
typedef unsigned integer type uint_least8_t;
typedef unsigned integer type uint_least16_t;
typedef unsigned integer type uint_least32_t;
typedef unsigned integer type uint_least64_t;
typedef unsigned integer type uintmax_t;
typedef unsigned integer type uintptr_t; // optional
} // namespace std
Since the only requirement made is that it must be 8 bits, a typedef to char is acceptable.
Answer 5:
char, signed char, and unsigned char are three different types, and a char is not always 8 bits. On most platforms they are all 8-bit integers, but std::ostream only defines the char versions of >> for behavior like scanf("%c", ...).
Source: https://stackoverflow.com/questions/15911714/are-int8-t-and-uint8-t-intended-to-be-char-types