I'm having a devil of a time understanding references. Consider the following code:
class Animal
{
public:
virtual void makeSound() {cout << "rawr"
In order to avoid slicing you have to return or pass around a pointer or reference to the object. (Note that a reference is basically a 'permanently dereferenced pointer'.)
Animal r2 = rFunc();
r2.makeSound();
Here, r2 is being instantiated (using the compiler-generated copy constructor), but the copy leaves off the Dog parts. If you do it like this, the slicing won't occur:
Animal& r2 = rFunc();
However your vFunc() function slices inside the method itself.
I'll also mention this function:
Animal& rFunc()
{
return *(new Dog());
}
It's weird and unsafe: you're returning a reference to an unnamed heap-allocated object (the dereferenced Dog), and nothing tells the caller that it must eventually be deleted. It's more appropriate to return the pointer. Returning references is normally used to return member variables and so on.
Point 1: do not use references. Use pointers.
Point 2: the thing you have above is called a Taxonomy, which is a hierarchical classification scheme. Taxonomies are the exemplar of a kind of structure that is utterly unsuitable for object-oriented modelling. Your trivial example only works because your base Animal assumes all animals make a noise, and can't do anything else interesting.
If you try to implement a relation, such as
virtual bool Animal::eats(Animal *other)=0;
you will find you cannot do it. The thing is: Dog is not a subtype of the Animal abstraction. The whole point of Taxonomies is that the classes at each level of the partition have new and interesting properties.
For example: Vertebrates have a backbone, and we can ask whether it is made of cartilage or bone. We can't even ask that question of Invertebrates.
To fully understand, you must see that you cannot make a Dog object. After all, it's an abstraction, right? There are Kelpies and Collies, and an individual Dog has to be of some breed. The classification scheme can be as deep as you like, but it can never support any concrete individuals. Fido is not-a-Dog; that's just his classification tag.
To answer the second part of your question ("how do I communicate that the pointer is subject to deletion at any time") -
This is a dangerous practice, and has subtle details you will need to consider. It is racy in nature.
If the pointer can be deleted at any point in time, it is never safe to use it from another context, because even if you check "are you still valid?" every time, it may be deleted just a tiny bit after the check, but before you get to use it.
A safe way to do these things is the "weak pointer" concept: have the object be stored as a shared pointer (one level of indirection, can be released at any time), and have the returned value be a weak pointer, something that you must query before you can use and must release after you've used it. This way, as long as the object is still valid, you can use it.
Pseudo code (based on invented weak and shared pointers, I'm not using Boost...) -
weak< Animal > animalWeak = getAnimalThatMayDisappear();
// ...
{
shared< Animal > animal = animalWeak.getShared();
if ( animal )
{
// 'animal' is still valid, use it.
// ...
}
else
{
// 'animal' is not valid, can't use it. It points to NULL.
// Now what?
}
}
// And at this point the shared pointer of 'animal' is implicitly released.
But this is complex and error prone, and would likely make your life harder. I'd recommend going for simpler designs if possible.
But let's say that I want a function that returns an Animal value that is really a Dog.
- Do I understand correctly that the closest that I can get is a reference?
Yes, you are correct. But I think the problem isn't so much that you don't understand references, but that you don't understand the different types of variables in C++ or how new works in C++. In C++, a variable can be primitive data (int, float, double, etc.), an object, or a pointer/reference to a primitive and/or object. In Java, variables can only be a primitive or a reference to an object.
In C++, when you declare a variable, actual memory is allocated and associated with the variable. In Java, you have to explicitly create objects using new and explicitly assign the new object to a variable. The key point here though is that, in C++, the object and the variable you use to access it are not the same thing when the variable is a pointer or reference. Animal a; means something different from Animal *a; which means something different from Animal &a;. None of these have compatible types, and they are not interchangeable.
When you type Animal a1; in C++, a new Animal object is created. So, when you type Animal a2 = a1;, you end up with two variables (a1 and a2) and two Animal objects at different locations in memory. Both objects have the same value, but you can change their values independently if you want. In Java, if you typed the same exact code, you'd end up with two variables, but only one object. As long as you didn't reassign either of the variables, they would always have the same value.
- Furthermore, is it incumbent upon the one using the rFunc interface to see that the reference returned is assigned to an Animal&? (Or otherwise intentionally assign the reference to an Animal which, via slicing, discards polymorphism.)
When you use references and pointers, you can access an object's value without copying it to where you want to use it. That allows you to change it from outside the curly braces where you declared the object into existence. References are generally used as function parameters or to return an object's private data members without making a new copy of them. Typically, when you receive a reference, you don't assign it to anything. Using your example, instead of assigning the reference returned by rFunc() to a variable, one would normally type rFunc().makeSound();.
So, yes, it is incumbent on the user of rFunc(), if they assign the return value to anything, to assign it to a reference. You can see why. If you assign the reference returned by rFunc() to a variable declared as Animal animal_variable, you end up with one Animal variable, one Animal object, and one Dog object. The Animal object associated with animal_variable is, as much as possible, a copy of the Dog object that was returned by reference from rFunc(). But, you can't get polymorphic behavior from animal_variable because that variable isn't associated with a Dog object. The Dog object that was returned by reference still exists because you created it using new, but it is no longer accessible--it was leaked.
- How on earth am I supposed to return a reference to a newly generated object without doing the stupid thing I did above in rFunc? (At least I've heard this is stupid.)
The problem is that you can create an object in three ways.
{ // the following expressions evaluate to ...
    Animal local;
    // an object that will be destroyed when control exits this block
    Animal();
    // an unnamed object that will be destroyed at the end of this statement if not bound to a reference
    new Animal();
    // an unnamed Animal *pointer*; the object it points to can't be deleted unless the pointer is stored in an Animal pointer variable
    {
        // doing other stuff
    }
} // <- local destroyed
All new does in C++ is create an object in memory where it won't be destroyed until you say so. But in order to destroy it, you have to remember where in memory it was created. You do that by creating a pointer variable, Animal *AnimalPointer;, and assigning the pointer returned by new Animal() to it, AnimalPointer = new Animal();. To destroy the Animal object when you are done with it, you have to type delete AnimalPointer;.
(I'm ignoring your problems with dynamic memory going into references causing memory leaks... )
Your slicing problems go away when Animal is an abstract base class. That means it has at least one pure virtual method and cannot be directly instantiated. The following becomes a compiler error:
Animal a = rFunc(); // a cannot be directly instantiated
// slicing prevented by compiler!
but the compiler allows:
Animal* a = pFunc(); // polymorphism maintained!
Animal& a = rFunc(); // polymorphism maintained!
Thus the compiler saves the day!
1) If you're creating new objects, you never want to return a reference (see your own comment on #3.) You can return a pointer (possibly wrapped by std::shared_ptr or std::unique_ptr; std::auto_ptr was the pre-C++11 option, but it is deprecated since C++11 and removed in C++17). (You could also return by copy, but this is incompatible with using the new operator; it's also slightly incompatible with polymorphism.)
2) rFunc is just wrong. Don't do that. If you used new to create the object, then return it through an (optionally wrapped) pointer.
3) You're not supposed to. That is what pointers are for.
EDIT (responding to your update:) It's hard to picture the scenario you're describing. Would it be more accurate to say that the returned pointer may be invalid once the caller makes a call to some other (specific) method?
I'd advise against using such a model, but if you absolutely must do this, and you must enforce this in your API, then you probably need to add a level of indirection, or even two. Example: Wrap the real object in a reference-counted object which contains the real pointer. The reference-counted object's pointer is set to null when the real object is deleted. This is ugly. (There may be better ways to do it, but they may still be ugly.)