I am curious as to what part of the dereferencing a NULL ptr causes undesired behavior. Example:
// #1
someObj * a;
a = NULL;
(*a).somefunc(); // crash, dere
That depends on the declaration of anotherfunc()
someObj * b;
anotherObj * c;
b = NULL;
c->anotherfunc(*b);
If anotherfunc() accepts a reference to b then you have not de-referenceed b, you have just converted it into a reference. If on the other hand it is a value parameter then a copy constructor will be invoked and then you have de-referenced it.
Weather it will crash will depend on many factors (like if it has members). But the act of de-referencing on a NULL is undefined so it has the option of working on your compiler.
As for the first option of calling a method on a NULL pointer.
This also is undefined behavior. Weather it crashes will depend on the compiler and OS. But it is perfectly valid to not crash (the behavior is undefined).
A lot of confusion is derived because people refer to the * in *b as de-reference operator. This may be its common name but in the standard it is the 'unary * operator' and it is defined as:
5.3.1
The unary * operator performs indirection: the expression to which it is applied shall be a pointer to an object type, or a pointer to a function type and the result is an lvalue referring to the object or function to which the expression points.
So the 'unary * operator' returns a reference to the object that was pointed at by the pointer it was applied to. (No de-referencing has happened at this point).
Although in the standards dereferencing a zero pointer (NULL) is undefined behavior, current processors and operating systems generate a segmentation fault or similar error.
Maybe that function you called accepts a reference parameter (which IS a pointer) and that function doesn't use the paramenter, so the NULL won't be dereferenced.
In the early days, programmers were spending lot of time tracing down memory corruption bugs. One day a light bulb light up in some smart programmer's head. He said "What if I make it illegal to access the first page of memory and point all invalid pointers to it?" Once that happened, most memory corruption bugs were quickly found.
That's the history behind null pointer. I heard the story so many years ago, I can't recall any detail now, but I'm sure someone how's older...I mean wiser can tell us more about it.
NULL is just 0. Since 0 doesn't point to a real memory address, you can't dereference it. *b can't just resolve to NULL, since NULL is something that applies to pointers, not objects.
There is a concept, in the standard, of a null pointer value. This is a distinct value that causes undefined behavior when the program attempts to access memory through it. In practice, lots of modern implementations have it crash the program, which is useful behavior. After all, such an attempt is a mistake.
The name of the null pointer value is 0
, or any other constant integral expression in pointer context (like 3 - 3
, for example). There is also a NULL
macro, which has to evaluate to 0 in C++ but can be (void *)0
in C (C++ insists more on pointers being type-safe). In C++0x, there will be an explicit value called nullptr
, finally giving the null pointer an explicit name.
The value of the null pointer doesn't have to be an actual zero, although it is on all implementations I'm aware of, and the odd computers where that didn't work have mostly been retired.
You're misstating what happens in your last example. *b
doesn't resolve into anything. Passing *b
is undefined behavior, which means the implementation can do anything it likes with it. It may or may not be flagged as an error, and may or may not cause problems. The behavior can change for no apparent reason, and so doing this is a mistake.
If a called function is expecting a pointer value, passing it a null pointer value is perfectly legitimate, and the called function should handle it properly. Dereferencing a null pointer value is never legitimate.
Whether or not the mere fact of dereferencing a null pointer already results in undefined behavior is currently a gray zone in the Standard, unfortunately. What is certain is that reading a value out of the result of dereferencing a pointer is undefined behavior.
That it is undefined behavior is stated by various notes throughout the Standard. But notes are not normative: They could say anything, but they will never be able to state any rules. Their purpose is entirely informative.
That calling a member function on a null pointer formally is undefined behavior too.
The formal problem with merely dereferencing a null pointer is that determining the identity of the resulting lvalue expression is not possible: Each such expression that results from dereferencing a pointer must unambiguously refer to an object or a function when that expression is evaluated. If you dereference a null pointer, you don't have an object or function that this lvalue identifies. This is the argument the Standard uses to forbid null-references.
Another problem that adds to the confusion is that the semantics of the typeid
operator make part of this misery well defined. It says that if it was given an lvalue that resulted from dereferencing a null pointer, the result is throwing a bad_typeid
exception. Although, this is a limited area where there exist an exception (no pun) to the above problem of finding an identity. Other cases exist where similar exception to undefined behavior is made (although much less subtle and with a reference on the affected sections).
The committee discussed to solve this problem globally, by defining a kind of lvalue that does not have an object or function identity: The so called empty lvalue. That concept, however, still had problems, and they decided not to adopt it.
Now, practically, you will not encounter a crash when merely dereferencing a null pointer. The problem of identifying an object or function for an lvalue seems to be entirely language theoretical. What is problematic is when you try to read a value out of the result of dereference. The following case will almost certainly crash, because it tries to read an integer from an address which is most probably not mapped by the affected process
int a = *(int*)0;
There are few cases where reading out of such an expression probably won't cause a crash. One is when you dereference an array pointer:
int *pa = *(int(*)[1])0;
Since reading from an array just returns its address using a element pointer type, this will most probably just make a null pointer (but as you dereference a null pointer before, this still is undefined behavior formally). Another case is dereferencing of function null pointers. Here too, reading a function lvalue just give you its address but using a function pointer type:
void(*pf)() = *(void(*)())0;
Aswell as the other cases, this is undefined behavior too, of course, but will probably not result in a crash.
Like the above cases, just calling a non-virtual member function on a null pointer isn't practically problematic either, most probably - even though it formally is undefined behavior. Calling the function will jump to the functions address, and don't need to read any data. As soon as you would try to read a nonstatic data-member, the same problem occurs as when reading out of a normal null pointer. Some people place an
assert(this != NULL);
In front of some member function bodies in case they accidentally called a function on a null pointer. This may be a good idea when there are often cases where such functions are mistakenly called on null pointers, to catch errors early. But from a formal point of view, this
can never be a null pointer in a member function.