How are the names of arrays stored in memory?
For example, if I write:
char arr[10];
the array items are stored in memory starting f
If you mean the literal name that you declared it with, that is generally compiled out unless you compiled with debugging symbols (-g).
If you declared the array on the stack, the machine code generally refers to its elements by offsets from the frame pointer (%ebp on 32-bit x86, written $ebp in gdb).
To answer your implied question: for an array on the stack, the name is not stored anywhere. The machine code does not know what arr is. It only knows that there is a region of memory that you are addressing using offsets (indices in your code).
The question is more meaningful for an array on the heap, because now you have two things: a pointer (which has its own address), and the memory that holds the actual array (which is on the heap, and whose address is stored in the pointer).
Consider the following code.
char* arr = malloc(5);
Assuming you compiled with debug symbols, if you look at &arr inside gdb, you will see the address where the pointer arr is stored. Printing arr itself shows the address of the heap block it points to.
You can demonstrate the same thing if you create a separate pointer to an array on the stack.
char arr[10];
char* ptr = arr;
Here, you will see that ptr has separate storage (p &ptr), and holds the address of arr as its value, but &arr itself is equal to the address of its first element.
Neither arr itself nor the elements arr[0], arr[1], ..., arr[9] have their addresses stored anywhere in memory.
arr is simply the base address of the array. Individual elements are located by adding offsets to that address. Once the code is compiled, the name of the array is no longer useful, except for debugging.
As others stated, the names of your variables are symbols in your source code. They are not translated into machine code by the compiler and therefore are nowhere to be found in it, although they can be attached to your executable for debugging purposes using appropriate options (for example -g in gcc, which will produce debugging information in the operating system's native format).
In the machine code, variables are referenced by memory locations, which take various forms depending on the machine and on the compiler:
Function parameters and local variables depend on calling conventions and optimizations, and are typically (on x86) addresses computed from the special register %ebp (known as the base pointer) plus an offset. They can also be passed directly in registers (for instance, the first parameter can be assigned to a general-purpose register), or addressed as an offset from a different, function-specific address in memory.
When dereferencing a pointer, the address of the variable is obtained indirectly from memory (and may only be known at runtime when using dynamic memory allocation such as malloc).
Bottom line: in the machine code, you usually won't find variable names (when compiled from C). Instead, you have registers and addresses (or, to be precise, numeric constants and computed values that you have to guess are addresses), which is part of why understanding the original meaning of a program only by looking at the binary is a difficult problem (on top of the problems posed by disassembling a binary in the first place).
The name of the array is lost when you compile the program; it exists only for reference in the source. The elements are stored on the stack, but the array name does not survive compilation unless you compile with the -g option for debugging.