The design is quite elegant, and pretty much necessary once you consider how referring to an array works at the assembly level. Taking x86 as the example, consider the following C code:
void f(int array[]) { return; }
void g(int (*array)[]) { return; }

int main()
{
    int a[5];
    f(a);
    g(&a);
    return 0;
}
The array a will take up 20 bytes on the stack, since an int takes up 4 bytes on most platforms. With the register EBP pointing at the base of the stack's activation record, you would be looking at something like the following assembly for the main() function above:
//subtract 20 bytes from the stack pointer register ESP for the array
sub esp, 20
//the array is now allocated on the stack
//get the address of the start of the array, and move it into EAX register
lea eax, [ebp - 20]
//push the address contained in EAX onto the stack for the call to f()
//this is pretty much the only way that f() can refer to the array allocated
//in the stack for main()
push eax
call f
//clean-up the stack
pop eax
//get a pointer to the whole array of ints on the stack
//(the type of &a is "int (*)[5]", which is compatible with the "int (*)[]" parameter of g())
lea eax, [ebp - 20]
//make the function call again using the stack for the function parameters
push eax
call g
//...clean up the stack and return
The assembly instruction LEA, or "Load Effective Address", calculates the address from the expression in its second operand and stores it in the register named by the first operand. So every time we use that instruction, it's the equivalent of the C address-of operator. You'll notice that the address where the array starts (i.e., [ebp - 20], or 20 bytes below the base address of the stack frame held in the register EBP) is what gets passed to both functions f and g. That's pretty much the only way, at the machine-code level, for one function to refer to a chunk of memory allocated on another function's stack without actually copying the contents of the array.
The take-away is that arrays are not the same as pointers, but at the same time, the only effective way to refer to an array on the right-hand side of the assignment operator, or when passing it to a function, is to pass it around by reference, which means that referring to the array by name is, at the machine level, exactly the same as getting a pointer to it. Therefore, at the machine-code level, a, &a, and even &a[0] in these situations reduce to the same instruction (in this example, lea eax, [ebp - 20]). But again, an array type is not a pointer type, and a and &a do not have the same type. Since an array designates a chunk of memory, the easiest and most effective way to get a reference to it is through a pointer.