I just realised that this program compiles and runs (gcc version 4.4.5 / Ubuntu):
#include
using namespace std;
class Test
{
public:
// copyc
I don't know the spec reference, but I do know that accessing an uninitialized pointer always results in undefined behaviour.
When I compile your code in Visual C++ I get:
test.cpp(20): warning C4700: uninitialized local variable 'b' used
If you crank your warning levels up, your compiler will probably warn you about using uninitialized stuff. UB doesn't require a diagnostic, so many things that are "obviously" wrong may still compile.
I have no idea how this relates to the specification, but this is how I see it:
When you do `Test a(a);` it allocates space for `a` on the stack. Therefore the location of `a` in memory is known to the compiler at the start of `main`. When the constructor is called (the memory is of course allocated before that), the correct `this` pointer is passed to it because it's known.
When you do `Test *b = new Test(*b);`, you need to think of it as two steps. First the object is allocated and constructed, and then the pointer to it is assigned to `b`. The reason you get the message you get is that you're essentially passing an uninitialized pointer to the constructor, and then comparing it with the actual `this` pointer of the object (which will eventually get assigned to `b`, but not before the constructor exits).
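A minimal sketch of the stack case, assuming a stand-in class (`Probe` and `stack_self_copy_matches` are made-up names, not the OP's code). It only compares addresses and never reads the value of the not-yet-constructed object; whether binding the reference is strictly blessed by the standard is a separate question, but mainstream compilers behave as described:

```cpp
#include <cassert>

// Hypothetical stand-in for the question's class: the copy constructor
// only records whether its argument is the object being constructed.
struct Probe {
    bool copied_from_self;
    Probe() : copied_from_self(false) {}
    // Only the address of `other` is taken; its value is never read.
    Probe(const Probe& other) : copied_from_self(&other == this) {}
};

// Storage for `a` exists before its constructor runs, so inside the
// constructor `&other` and `this` are the same address.
inline bool stack_self_copy_matches() {
    Probe a(a);                    // self-initialization, as in `Test a(a);`
    Probe b(a);                    // ordinary copy from a different object
    assert(!b.copied_from_self);   // here `&other` is `&a`, not `&b`
    return a.copied_from_self;
}
```

Calling `stack_self_copy_matches()` should return `true` on typical implementations. Some compilers may warn about the self-initialization.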
The first case is (perhaps) covered by 3.8/6:
before the lifetime of an object has started but after the storage which the object will occupy has been allocated or, after the lifetime of an object has ended and before the storage which the object occupied is reused or released, any lvalue which refers to the original object may be used but only in limited ways. Such an lvalue refers to allocated storage (3.7.3.2), and using the properties of the lvalue which do not depend on its value is well-defined.
Since all you're using of `a` (and `other`, which is bound to `a`) before the start of its lifetime is the address, I think you're good: read the rest of that paragraph for the detailed rules.
Beware though that 8.3.2/4 says, "A reference shall be initialized to refer to a valid object or function." There is some question (as a defect report on the standard) what "valid" means in this context, so possibly you can't bind the parameter `other` to the unconstructed (and hence, "invalid"?) `a`.
So, I'm uncertain what the standard actually says here - I can use an lvalue, but not bind it to a reference, perhaps, in which case `a` isn't good, while passing a pointer to `a` would be OK as long as it's only used in the ways permitted by 3.8/5.
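As a rough illustration of the pointer variant mentioned above, here is a sketch with a hypothetical class (`PtrProbe` is a made-up name) whose constructor takes a pointer instead of a reference. The pointer is only compared, never dereferenced, which is the kind of limited use 3.8/5 talks about:

```cpp
#include <cassert>

// Hypothetical class: takes a pointer rather than a reference, so no
// reference is ever bound to the not-yet-constructed object.
struct PtrProbe {
    bool given_own_address;
    // The pointer is compared but never dereferenced.
    explicit PtrProbe(const PtrProbe* p) : given_own_address(p == this) {}
};

inline bool pointer_self_init_matches() {
    PtrProbe a(&a);                      // address of `a` is known before construction
    PtrProbe other(&a);                  // a different object given `a`'s address
    assert(!other.given_own_address);
    return a.given_own_address;
}
```

`pointer_self_init_matches()` should return `true`; this sidesteps the "valid object" wording for references, not the rules on what you may do through the pointer.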
In the case of `b`, you're using the value before it's initialized (because you dereference it, and also because even if you got that far, `&other` would be the value of `b`). This clearly is not good.
As ever in C++, it compiles because it's not a breach of language constraints, and the standard doesn't explicitly require a diagnostic. Imagine the contortions the spec would have to go through in order to mandate a diagnostic when an object is invalidly used in its own initialization, and imagine the data flow analysis that a compiler might have to do to identify complex cases (it may not even be possible at compile time, if the pointer is smuggled through an externally-defined function). Easier to leave it as undefined behavior, unless anyone has any really good suggestions for new spec language ;-)
The reason this "is allowed" is that the rules say an identifier's scope starts immediately after its declarator, before the initializer. In the case
int i = i;
the RHS i is "after" the LHS i so i is in scope. This is not always bad:
void *p = (void*)&p; // p contains its own address
because a variable can be addressed without its value being used. In the case of the OP's copy constructor no error can be given easily, since binding a reference to a variable does not require the variable to be initialised: it is equivalent to taking the address of a variable. A legitimate constructor could be:
struct List { List *next; List(List &n) { next = &n; } };
where you see the argument is merely addressed, its value isn't used. In this case a self-reference could actually make sense: the tail of a list is given by a self-reference. Indeed, if you change the type of "next" to a reference, there's little choice since you can't easily use NULL as you might for a pointer.
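To make the self-referencing tail concrete, here is a small sketch building on the `List` idea. The names `Node`, `is_tail`, `length` and `demo_length` are mine, and the `value` member is added only for illustration:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical list node: the tail is marked by pointing at itself
// instead of holding a null pointer.
struct Node {
    int value;
    Node* next;
    // The argument is only addressed; its value is never read.
    Node(int v, Node& n) : value(v), next(&n) {}
    bool is_tail() const { return next == this; }
};

// Counts nodes by following `next` until a node points at itself.
inline std::size_t length(const Node& head) {
    std::size_t n = 1;
    for (const Node* p = &head; !p->is_tail(); p = p->next) ++n;
    return n;
}

inline std::size_t demo_length() {
    Node tail(3, tail);   // self-reference marks the end of the list
    Node mid(2, tail);
    Node head(1, mid);
    assert(tail.is_tail() && !head.is_tail());
    return length(head);  // head -> mid -> tail
}
```

`demo_length()` returns 3. The same caveat about binding a reference to an object under construction applies to `Node tail(3, tail);`.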
As usual, the question is backwards. The question is not why an initialisation of a variable can refer to itself, the question is why it can't refer forward. [In Felix, this is possible]. In particular, for types as opposed to variables, the lack of ability to forward reference is extremely broken, since it prevents recursive types being defined other than by using incomplete types, which is enough in C, but not in C++ due to the existence of templates.
The second one, where you use `new`, is actually easier to understand; what you're invoking there is exactly the same as:
Test *b;
b = new Test(*b);
and you're actually performing an invalid dereference. Try to add a `<< &other <<` to your `cout` lines in the constructor, and make that
Test *b = (Test *)0xF00D1E44BADD1E5;
to see that you're passing through whatever value a pointer on the stack has been given. If not explicitly initialized, that's undefined. But even if you don't initialize it with some sort of (in)sane default, it'll be different from the return value of `new`, as you found out.
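The two-step reading can be shown without any undefined behaviour by pointing `b` at a real object first. This is only a sketch under that assumption: `HeapProbe`, `placeholder` and `heap_copy_sees_old_pointer` are made-up names, and it uses `nullptr`, so it assumes C++11 or later:

```cpp
#include <cassert>

// Hypothetical class that remembers the address it was copied from.
struct HeapProbe {
    const HeapProbe* source;
    HeapProbe() : source(nullptr) {}
    HeapProbe(const HeapProbe& other) : source(&other) {}
};

inline bool heap_copy_sees_old_pointer() {
    HeapProbe placeholder;        // a valid target, unlike an uninitialized pointer
    HeapProbe* b = &placeholder;
    b = new HeapProbe(*b);        // `*b` is evaluated with the OLD value of `b`
    assert(b != &placeholder);    // `b` now holds the result of `new`
    // The constructor saw the old pointee, not the new object.
    bool saw_old = (b->source == &placeholder) && (b->source != b);
    delete b;
    return saw_old;
}
```

`heap_copy_sees_old_pointer()` returns `true`: the constructor argument comes from whatever `b` held before the assignment, which in the OP's code is an indeterminate value.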
For the first, think of it as an in-place `new`. `Test a` is a local variable, not a pointer; it lives on the stack and therefore its memory location is always well defined - this is very much unlike a pointer, `Test *b`, which, unless explicitly initialized to some valid location, will be dangling.
If you write your first instantiation like:
Test a(*(&a));
it becomes clearer what you're invoking there.
I don't know a way to make the compiler disallow (or even warn) about this sort of self-initialization-from-nowhere through the copy constructor.