I do not quite understand the difference between a C# reference and a pointer. They both point to a place in memory, don't they? The only difference I can figure out is that
One of the biggest benefits of references over pointers is greater simplicity and readability. As always, when you simplify something you make it easier to use, but at the cost of the flexibility and control you get with the low-level stuff (as other people have mentioned).
Pointers are often criticized for being 'ugly'.
MyClass* myClass = new MyClass();
Now every time you use it you have to dereference it first, either by
myClass->Method() or (*myClass).Method()
Despite losing some readability and adding complexity, people still needed to use pointers often as parameters so you could modify the actual object (instead of passing by value) and for the performance gain of not having to copy huge objects.
To me, this is why references were 'born' in the first place: to provide the same benefit as pointers but without all that pointer syntax. Now you can pass the actual object (not just its value) AND you have a more readable, normal way of interacting with the object.
void MyMethod(Type& parameter)
{
    parameter.DoThis();
    parameter.DoThat();
}
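Here is a minimal C++ sketch of the trade-off described above. The Counter type and the three function names are hypothetical, made up for illustration. The same increment is written three ways so the syntax can be compared side by side.

```cpp
#include <cassert>

// Hypothetical type used only for illustration.
struct Counter {
    int value = 0;
    void Increment() { ++value; }
};

// Pass by value: the function works on a copy, so the caller's object is unchanged.
void IncrementByValue(Counter c) { c.Increment(); }

// Pass by pointer: the caller's object is modified, but the body must dereference
// and should guard against a null pointer.
void IncrementByPointer(Counter* c) {
    assert(c != nullptr);
    c->Increment();
}

// Pass by reference: the caller's object is modified using ordinary member syntax.
void IncrementByReference(Counter& c) { c.Increment(); }
```

Calling IncrementByValue(c) leaves c.value untouched, while IncrementByPointer(&c) and IncrementByReference(c) each raise it by one. Only the pointer version needs & at the call site and -> in the body.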
C++ references differ from C# / Java references in that once you bind one to an object, that's it: you can't re-assign it (and it has to be bound when it is declared). This is the same as using a const pointer (a pointer that cannot be re-pointed to another object).
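To make the comparison with a const pointer concrete, here is a small C++ sketch. The function names are hypothetical. It shows that assigning to a reference writes through it rather than rebinding it, which is the same behavior you get from a T* const.

```cpp
#include <cassert>

// Assigning to a reference copies a value into the object it is bound to.
int AssignThroughReference() {
    int a = 1;
    int b = 2;
    int& ref = a;         // must be bound at declaration
    ref = b;              // does NOT rebind ref; copies b's value into a
    assert(&ref == &a);   // ref still refers to a
    return a;
}

// A const pointer behaves the same way, with explicit pointer syntax.
int AssignThroughConstPointer() {
    int a = 1;
    int b = 2;
    int* const ptr = &a;  // const pointer: cannot be re-pointed
    // ptr = &b;          // would not compile
    *ptr = b;             // writes through the pointer into a
    assert(ptr == &a);    // ptr still points at a
    return a;
}
```

Both functions return 2: in each case the original variable a was overwritten, and neither the reference nor the const pointer ended up pointing at b.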
Java and C# are very high-level, modern languages which cleaned up a lot of the messes that had accumulated in C / C++ over the years, and pointers were definitely one of those things that needed to be 'cleaned up'.
As far as your comment about knowing pointers makes you a stronger programmer, this is true in most cases. If you know 'how' something works as opposed to just using it without knowing I would say this can often give you an edge. How much of an edge will always vary. After all, using something without knowing how it is implemented is one of the many beauties of OOP and Interfaces.
In this specific example, how would knowing about pointers help you with references? Understanding that a C# reference is NOT the object itself but points to the object is a very important concept.
#1: You are NOT passing by value. Well, for starters, when you use a pointer you know that the pointer holds just an address, that's it. The variable itself holds almost nothing, and that's why it's so nice to pass as an argument. In addition to the performance gain, you are working with the actual object, so any changes you make are not temporary.
#2: Polymorphism / Interfaces. When you have a reference that is an interface type and it points to an object, you can only call methods of that interface even though the object may have many more abilities. The objects may also implement the same methods differently.
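Point #2 can be sketched in C++ as well, assuming an abstract base class stands in for a C# interface. IShape, Circle, Square, and Describe are hypothetical names made up for this example.

```cpp
#include <cassert>
#include <string>

// Hypothetical interface: an abstract base class in C++.
struct IShape {
    virtual ~IShape() = default;
    virtual std::string Name() const = 0;
};

struct Circle : IShape {
    std::string Name() const override { return "circle"; }
    double Radius() const { return 1.0; }  // extra ability, not part of IShape
};

struct Square : IShape {
    std::string Name() const override { return "square"; }
};

// Takes the interface by reference: no copy is made, and only IShape members are visible.
std::string Describe(const IShape& shape) {
    // shape.Radius();  // would not compile: Radius is not declared on IShape
    return shape.Name();
}
```

Describe(Circle{}) and Describe(Square{}) go through the same interface-typed reference, but each call runs that object's own implementation of Name().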
If you understand these concepts well then I don't think you are missing too much from not having used pointers. C++ is often used as a language for learning programming because it is good to get your hands dirty sometimes. Also, working with lower-level aspects makes you appreciate the comforts of a modern language. I started with C++ and am now a C# programmer, and I do feel like working with raw pointers has helped me gain a better understanding of what goes on under the hood.
I don't think it is necessary for everyone to start with pointers, but what is important is that they understand why references are used instead of value-types and the best way to understand that is to look at its ancestor, the pointer.
First I think you need to define a "Pointer" in your semantics. Do you mean the pointer you can create in unsafe code with fixed? Do you mean an IntPtr that you get from maybe a native call or Marshal.AllocHGlobal? Do you mean a GCHandle? They all are essentially the same thing - a representation of a memory address where something is stored - be it a class, a number, a struct, whatever. And for the record, they certainly can be on the heap.
A pointer (all of the above versions) is a fixed item. The GC has no idea what is at that address, and therefore has no ability to manage the memory or life of the object. That means you lose all of the benefits of a garbage collected system. You must manually manage the object memory and you have the potential for leaks.
A reference on the other hand is pretty much a "managed pointer" that the GC knows about. It's still an address of an object, but now the GC knows details of the target, so it can move it around, do compactions, finalize, dispose and all of the other nice stuff a managed environment does.
The major difference, really, is in how and why you would use them. For a vast majority of cases in a managed language, you're going to use an object reference. Pointers become handy for doing interop and the rare need for really fast work.
Edit: In fact here's a good example of when you might use a "pointer" in managed code - in this case it's a GCHandle, but the exact same thing could have been done with AllocHGlobal or by using fixed on a byte array or struct. I tend to prefer the GCHandle because it feels more ".NET" to me.
A major difference between a reference and a pointer is that a pointer is a collection of bits whose content only matters when it is actively being used as a pointer, while a reference encapsulates not only a set of bits, but also some metadata which keeps the underlying framework informed of its existence. If a pointer exists to some object in memory, and that object is deleted but the pointer is not erased, the pointer's continued existence won't cause any harm unless or until an attempt is made to access the memory to which it points. If no attempt is made to use the pointer, nothing will care about its existence. By contrast, reference-based frameworks like .NET or the JVM require that it always be possible for the system to identify every object reference in existence, and every object reference in existence must always either be null or else identify an object of its proper type.
Note that each object reference actually encapsulates two kinds of information: (1) the field contents of the object it identifies, and (2) the set of other references to the same object. Although there isn't any mechanism by which the system can quickly identify all the references that exist to an object, the set of other references that exist to an object may often be the most important thing encapsulated by a reference (this is especially true when things of type Object are used as things like lock tokens). Although the system keeps a few bits of data for each object for use in GetHashCode, objects have no real identity beyond the set of references that exist to them. If X holds the only extant reference to an object, replacing X with a reference to a new object with the same field contents will have no identifiable effect except to change the bits returned by GetHashCode(), and even that effect isn't guaranteed.
Pointers point to a location in the memory address space. References point to a data structure. Data structures are moved all the time (well, not that often, but every now and then) by the garbage collector (for compacting memory space). Also, as you said, data structures without references will get garbage collected after a while.
Also, pointers are only usable in an unsafe context.
The objects that C# references point to can, and will, be relocated by the garbage collector, but normal pointers are static. This is why we use the fixed keyword when acquiring a pointer to an array element: it prevents the array from being moved.
EDIT: Conceptually, yes. They are more or less the same.
The thing about pointers that makes them somewhat complex is not what they are, but what you can do with them. When you have a pointer to a pointer to a pointer, that's when it really starts to get fun.
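For anyone who has not seen multiple levels of indirection, here is a short C++ sketch. The function names are hypothetical. The first function uses a pointer to a pointer to re-point the caller's pointer, and the second walks through three levels to reach an int.

```cpp
#include <cassert>

// Re-points the caller's pointer at `target` by going through a pointer to that pointer.
void Repoint(int** pointerToPointer, int* target) {
    assert(pointerToPointer != nullptr);
    *pointerToPointer = target;
}

// Builds three levels of indirection and reads the int back through all of them.
int ReadThroughThreeLevels() {
    int value = 42;
    int* p = &value;    // pointer to int
    int** pp = &p;      // pointer to pointer to int
    int*** ppp = &pp;   // pointer to pointer to pointer to int
    return ***ppp;      // three dereferences reach the int
}
```

Each extra * in the type is one more address you have to follow before you get to the data, which is exactly the bookkeeping that references in C# and Java hide from you.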