When I was working on string::npos
I noticed something and I couldn\'t find any explanation for it on the web.
(string::npos == ULONG_MAX)
string::npos
is defined as constexpr static std::string::size_type string::npos = -1;
(or if it's defined inside the class definition that would be constexpr static size_type npos = -1;
but that's really irrelevant).
The wraparound of negative numbers converted to unsigned types (std::string::size_type
is basically std::size_t
, which is unsigned) is perfectly well-defined by the Standard. -1
wraps to the largest representable value of the unsigned type, which in your case is 18446744073709551615
. Note that the exact value is implementation-defined because the size of std::size_t
is implementation-defined (but capable of holding the size of the largest possible array on the system in question).
According to the C++ Standard (Document Number: N3337 or Document Number: N4296) std::string::npos
is defined the following way
static const size_type npos = -1;
where std::string::size_type is some unsigned integer type. So there is nothing wonderful that std::string::npos is equal to -1. The initializer is converted to the tyhpe of std::string::npos
.
As for this equation
(string::npos == ULONG_MAX) is true,
then it means that the type std::string::npos
has type in the used implementation unsigned long
. This type is usually corresponds to the type size_t
.
In this equation
(18446744073709551615 == -1)
The left literal has some unsigned integral type that is appropriate to store such a big literal. Thus the right operand is converted also to this unsigned type by propogating the sign bit. As the left operand represents itself the maximum value of the type then they are equal.
18,446,744,073,709,551,615
This number mentioned, 18,446,744,073,709,551,615
, is actually 2^64 − 1
. The important thing here is that 2^64-1
is essentially 0-based 2^64
. The first digit of an unsigned integer is 0
, not 1
. So if the maximum value is 1
, it has two possible values: 0
, or 1
(2).
Let's look at 2^64 - 1
in 64bit binary, all the bits are on.
1111111111111111111111111111111111111111111111111111111111111111b
-1
Let's look at +1
in 64bit binary.
0000000000000000000000000000000000000000000000000000000000000001b
To make it negative in One's Complement (OCP) we invert the bits.
1111111111111111111111111111111111111111111111111111111111111110b
Computers seldom use OCP, they use Two's Complement (TCP). To get TCP, you add one to OCP.
1111111111111111111111111111111111111111111111111111111111111110b (-1 in OCP)
+ 1b (1)
-----------------------------------------------------------------
1111111111111111111111111111111111111111111111111111111111111111b (-1 in TCP)
"But, wait" you ask, if in Twos Complement -1
is,
1111111111111111111111111111111111111111111111111111111111111111b
And, if in binary 2^64 - 1
is
1111111111111111111111111111111111111111111111111111111111111111b
Then they're equal! And, that's what you're seeing. You're comparing a signed 64 bit integer to an unsigned 64bit integer. In C++ that means convert the signed value to unsigned, which the compiler does.
For a technical correction thanks to davmac in the comments, the conversion from -1
which is signed
to an unsigned
type of the same size is actually specified in the language, and not a function of the architecture. That all said, you may find the answer above useful for understanding the arch/languages that support two's compliment but lack the spec to ensure results you can depend on.
This is all about signed overflow and the fact that negative numbers are stored as 2s complement. The means that to get the absolute value of a negative number, you invert all the bits and add one. Meaning when doing an 8 bit comparison 255 and -1 have the same binary value of 11111111. The same applies to bigger integers
https://en.m.wikipedia.org/wiki/Two%27s_complement