My background is C++ and I\'m currently about to start developing in C# so am doing some research. However, in the process I came across something that raised a question abo
There is a lot of bad writing out there. For example the statement:
In C++ an array is merely a pointer.
is simply false. How can such bad writing come about? We can only speculate, but one possible theory is that the author learned C++ by trial and error using a compiler, and formed a faulty mental model of C++ based on the results of his experiments. This is possibly because the syntax used by C++ for arrays is unconventional.
The next question is, how can a learner know if he/she is reading good material or bad material? Other than by reading my posts of course ;-) , participating in communities like Stack Overflow helps to bring you into contact with a lot of different presentations and descriptions, and then after a while you have enough information and experience to make your own decisions about which writing is good and which is bad.
Moving back to the array/pointer topic: my advice would be to first build up a correct mental model of how object storage works when we are working in C++. It's probably too much to write about just for this post, but here is how I would build up to it from scratch:
int x;
it is decided that a particular block of adjacent bytes is set aside to store an integer value. An object is any region of allocated storage. (Yes this is a slightly circular definition!)x
only exists during the compilation stage of a program. At runtime there can be int
objects allocated that never had a name; and there can be other int
objects with one or more names during compilation. int
From hereon in, C++ syntax comes into it. C++'s type system uses strong typing which means that each object has a type. The type system extends to pointers. In almost all situations, the storage used to store a pointer only saves the address of the first byte of the object being pointed to; and the type system is used at compilation time to keep track of what is being pointed to. This is why we have different types of pointer (e.g. int *
, float *
) despite the fact that the storage may consist of the same sort of address in both cases.
Finally: the so-called "array-pointer equivalence" is not an equivalence of storage, if you understood my last two bullet points. It's an equivalence of syntax for looking up members of an array.
Since we know that a pointer can be used to find another object; and an array is a series of many adjacent objects; then we can work with the array by working with a pointer to that array's first element. The equivalence is that the same processing can be used for both of the following:
and furthermore, those concepts can be both expressed using the same syntax.
it's as simple as:
int arr[10];
int* arr_pointer1 = arr;
int* arr_pointer2 = &arr[0];
so, since arrays are contiguous in memory, writing
arr[1];
is the same as writing:
*(arr_pointer+1)
pushing things a bit further, writing:
arr[2];
//resolves to
*(arr+2);
//note also that this is perfectly valid
2[arr];
//resolves to
*(2+arr);
The meaning in this guide was simply that in C# an array is an object (perhaps like in STL that we can use in C++), while in C++ an array is basically a sequence of variables located & allocated one after the other, and that's why we can refer to them using a pointer (pointer++
will give us the next one etc.).
They are most definitely not the same thing at all, but in this case, confusion can be forgiven because the language semantics are ... flexible and intended for the maximum confusion.
Let's start by simply defining a pointer and an array.
A pointer (to a type T) points to a memory space which holds at least one T (assuming non-null).
An array is a memory space that holds multiple Ts.
A pointer points to memory, and an array is memory, so you can point inside or to an array. Since you can do this, pointers offer many array-like operations. Essentially, you can index any pointer on the presumption that it actually points to memory for more than one T.
Therefore, there's some semantic overlap between (pointer to) "Memory space for some Ts" and "Points to a memory space for some Ts". This is true in any language- including C#. The main difference is that they don't allow you to simply assume that your T reference actually refers to a space where more than one T lives, whereas C++ will allow you to do that.
Since all pointers to a T can be pointers to an array of T of arbitrary size, you can treat pointers to an array and pointers to a T interchangably. The special case of a pointer to the first element is that the "some Ts" for the pointer and "some Ts" for the array are equal. That is, a pointer to the first element yields a pointer to N Ts (for an array of size N) and a pointer to the array yields ... a pointer to N Ts, where N is equal.
Normally, this is just interesting memory crapping-around that nobody sane would try to do. But the language actively encourages it by converting the array to the pointer to the first element at every opportunity, and in some cases where you ask for an array, it actually gives you a pointer instead. This is most confusing when you want to actually use the array like a value, for example, to assign to it or pass it around by value, when the language insists that you treat it as a pointer value.
Ultimately, all you really need to know about C++ (and C) native arrays is, don't use them, pointers to arrays have some symmetries with pointers to values at the most fundamental "memory as an array of bytes" kind of level, and the language exposes this in the most confusing, unintuitive and inconsistent way imaginable. So unless you're hot on learning implementation details nobody should have to know, then use std::array
, which behaves in a totally consistent, very sane way and just like every other type in C++. C# gets this right by simply not exposing this symmetry to you (because nobody needs to use it, give or take).
Arrays and pointers in C and C++ can be used with the exact same semantics and syntax in the vast majority of cases.
That is achieved by one feature:
Arrays decay to pointers to their first element in nearly all contexts.
Exceptions in C: sizeof
, _Alignas
, _Alignas
, address-of &
In C++, the difference can also be important for overload-resolution.
In addition, array notation for function arguments is deceptive, these function-declarations are equivalent:
int f(int* a);
int f(int a[]);
int f(int a[3]);
But not to this one:
int f(int (&a)[3]);
Besides what has already been told, there is one big difference: pointers are variables to store memory addresses, and they can be incremented or decremented and the values they store can change (they can point to any other memory location). That's not the same for arrays; once they are allocated, you can't change the memory region they reference, e.g. you cannot assign other values to them:
int my_array[10];
int x = 2;
my_array = &x;
my_array++;
Whereas you can do the same with a pointer:
int *p = array;
p++;
p = &x;