How does this pointer arithmetic work?

Question

#include <stdio.h>

int main(void){
  unsigned a[3][4] = {
    {2,23,6,7},
    {8,5,1,4},
    {12,15,3,9}
 };
 printf("%u",*((int*)(((char*)a)+4)));
 return 0;
}

The output on my machine is the value at a[0][1], i.e. 23. Could somebody explain how this works?

Edit: Rolling back to the old yucky code, exactly as it was presented to me :P

Answer 1:

So you have your array in memory as so:

2, 23, 6, 7, 8...

What this does is cast the array to a char*, which lets you access individual bytes, and it points here:

2, 23, 6, 7, 8...
^

It then adds four bytes, moving it over to the next value (more on this later).

2, 23, 6, 7, 8...
   ^

Then it turns it into an int* and dereferences it, getting the value 23.

There are technically three things wrong with this code.

The first is that it assumes that an unsigned is 4 bytes in size. (Hence the + 4). But this isn't necessarily true! Better would have been + sizeof(unsigned), ensuring correctness no matter what size unsigned happens to be.

The second problem is the cast to int*: the original array holds unsigned values, but the element is read as an int. There exist values in the unsigned range that int cannot represent (because half of an int's range is negative). So if one of the values in the array were not representable as an int (meaning the value was greater than INT_MAX), you'd get the wrong value. Better would be to cast to unsigned*, to maintain the correct type.

The last thing is the format specifier. The specifier for a signed int is %d, but the code uses %u, which is for unsigned integers. In effect, even though casting to int* was wrong, printf is going to interpret that value as an unsigned again, restoring its integrity. By fixing problem two, problem three fixes itself.

There is a hidden fourth problem: The code sucks. This may be for learning purposes, but yuck.

Answer 2:

The array:

unsigned a[3][4] = {
    {2,23,6,7},
    {8,5,1,4},
    {12,15,3,9}
};

is laid out in memory as follows (assuming a itself is at memory location 0x8000, big-endian byte order, and a four-byte unsigned):

0x8000  0  0  0  2
0x8004  0  0  0 23
0x8008  0  0  0  6
0x800C  0  0  0  7
0x8010  0  0  0  8
0x8014  0  0  0  5
0x8018  0  0  0  1
0x801C  0  0  0  4
0x8020  0  0  0 12
0x8024  0  0  0 15
0x8028  0  0  0  3
0x802C  0  0  0  9

Breaking down the expression:

*((int*)(((char*)a)+4))

((char*)a) gives you a char pointer.
+4 advances that pointer by 4 bytes (4 * sizeof(char))
(int*) turns the result of that back into an int pointer.
* dereferences that pointer to extract the int.

This is a very silly way of doing it since it's inherently non-portable (to environments where an int is two or eight bytes, for example).

Answer 3:

It first implicitly converts the array a into a pointer to its beginning. Then it casts the pointer to char* and increments it by 4. The value 4 happens to be the same as sizeof(unsigned) on your system, so it has actually moved one element forward from the beginning. Then it casts the address to int* and reads the value it points to (operator*). The resulting value is printed as an unsigned integer, which works because int and unsigned are the same size.

A 2D array like this is laid out in memory so that all the elements are stored in sequence, row by row, just like a one-dimensional array.

Answer 4:

On your machine, unsigned int has size 4, i.e. sizeof(unsigned) == 4.

It can hold 4 chars, each of which is one byte [in C; not so in Java, C#, etc.].

An array is allocated contiguously in memory. When you treat an unsigned array as a char*, you need to move the pointer 4 steps to reach the next unsigned value in the array.

Answer 5:

First, you create a 2-dim array with size 3x4.

After the cast ((char*)a), you can work with it as a char array. Let's call it b.

((char*)a)+4 is the same as &b[4]: it points to the 5th element of the char array (remember that arrays in C are 0-based), or simply the 5th byte.

When you convert the pointer back to int*, the i-th element of the int array begins at byte i*4, if sizeof(int) == 4. So the second element of the int array begins at the 5th byte, which is where your pointer points. The compiler takes the 4 bytes starting at offset 4 and treats them as an int. That is exactly a[0][1].

Source: https://stackoverflow.com/questions/2126421/how-does-this-pointer-arithmetic-work

Tags

c++

puzzle