Can anyone explain the logic of how this code adds a and b?
#include <stdio.h>
int main()
{
    int a=30000, b=20, sum;
    char *p;
    p = (char *) a;
    sum = (int)&p[b];
    printf("%d\n", sum);
    return 0;
}
int a=30000, b=20, sum;
char *p; //1. p is a pointer to char
p = (char *) a;
a is of type int, and has the value 30000. The above assignment converts the value 30000 from int to char* and stores the result in p.
The semantics of converting integers to pointers are (partially) defined by the C standard. Quoting the N1570 draft, section 6.3.2.3 paragraph 5:
An integer may be converted to any pointer type. Except as previously specified, the result is implementation-defined, might not be correctly aligned, might not point to an entity of the referenced type, and might be a trap representation.
with a (non-normative) footnote:
The mapping functions for converting a pointer to an integer or an integer to a pointer are intended to be consistent with the addressing structure of the execution environment.
The standard makes no guarantees about the relative sizes of types int and char*; either could be bigger than the other, and the conversion could lose information. The result of this particular conversion is very unlikely to be a valid pointer value. If it's a trap representation, then the behavior of the assignment is undefined.
On a typical system you're likely to be using, char* is at least as big as int, and integer-to-pointer conversions probably just reinterpret the bits making up the integer's representation as the representation of a pointer value.
sum = (int)&p[b];
p[b] is by definition equivalent to *(p+b), where the + denotes pointer arithmetic. Since the pointer points to char, and a char is by definition 1 byte, the addition advances the pointed-to address by b bytes in memory (in this case 20).
But p is probably not a valid pointer, so any attempt to perform arithmetic on it, or even to access its value, has undefined behavior.
In practice, most C compilers generate code that doesn't perform extra checks. The emphasis is on fast execution of correct code, not on detection of incorrect code. So if the previous assignment to p set it to an address corresponding to the number 30000, then adding b, or 20, to that address will probably yield an address corresponding to the number 30020.
That address is the result of (p+b); now the [] operator implicitly applies the * operator to that address, giving you the object that that address points to. Conceptually, this is a char object stored at an address corresponding to the integer 30020.
We immediately apply the & operator to that object. There's a special-case rule that says applying & to the result of a [] operator is equivalent to just doing the pointer addition, with neither the & nor the implied * actually evaluated; see 6.5.3.2p3 in the above referenced standard draft.
So this:
&p[b]
is equivalent to:
p + b
which, as I said above, yields an address (of type char*) corresponding to the integer value 30020, assuming, of course, that integer-to-pointer conversions behave in a certain way and that the undefined behavior of constructing and accessing an invalid pointer value doesn't do anything surprising.
Finally, we use a cast operator to convert this address to type int. Conversion of a pointer value to an integer is also implementation-defined, and possibly undefined. Quoting 6.3.2.3p6:
Any pointer type may be converted to an integer type. Except as previously specified, the result is implementation-defined. If the result cannot be represented in the integer type, the behavior is undefined. The result need not be in the range of values of any integer type.
It's not uncommon for a char* to be bigger than an int (for example, I'm typing this on a system with 32-bit int and 64-bit char*). But we're relatively safe from overflow in this case, because the char* value is the result of converting an in-range int value. There's no guarantee that converting a given value from int to char* and back to int will yield the original result, but it commonly works that way, at least for values that are in range.
So if a number of implementation-specific assumptions happen to be satisfied by the implementation on which the code happens to be running, then this code is likely to yield the same result as 30000 + 20.
Incidentally, I've worked on a system where this would have failed. The Cray T90 was a word-addressed machine, with hardware addresses pointing to 64-bit words; there was no hardware support for byte addressing. But char was 8 bits, so char* and void* pointers had to be constructed and manipulated in software. A char* pointer consisted of a 64-bit word pointer with a byte offset stored in the otherwise unused high-order 3 bits. Conversions between pointers and integers did not treat these high-order bits specially; they were simply copied. So ptr + 1 and (char*)((int)ptr + 1) could yield very different results.
But hey, you've managed to add two small integers without using the + operator, so there's that.