I am moving towards C++11 from C++98 and have become familiar with the auto
keyword. I was wondering why we need to explicitly declare auto
if the
syntax has to be unambiguous and also backward compatible.
If auto is dropped there will be no way to distinguish between statements and definitions.
auto n = 0; // fine
n = 0; // statement, n is undeclared.
Was it not possible to achieve the same outcome without explicitly declaring a variable auto?
I am going to rephrase your question slightly in a way that will help you understand why you need auto
:
Was it not possible to achieve the same outcome without explicitly using a type placeholder?
Was it not possible? Of course it was "possible". The question is whether it would be worth the effort to do it.
Most syntaxes in other languages that do not use typenames work in one of two ways. There's the Go-like way, where name := value;
declares a variable. And there's the Python-like way, where name = value;
declares a new variable if name
has not previously been declared.
Let's assume that there are no syntactic issues with applying either syntax to C++ (even though I can already see that identifier
followed by :
in C++ means "make a label"). So, what do you lose compared to placeholders?
Well, I can no longer do this:
auto &name = get<0>(some_tuple);
See, auto
always means "value". If you want to get a reference, you need to explicitly use a &
. And it will rightly fail to compile if the assignment expression is a prvalue. Neither of the assignment-based syntaxes has a way to differentiate between references and values.
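To make the reference case concrete, here is a minimal sketch. The function name write_through_reference and the tuple contents are made up for illustration; only auto & and std::get come from the discussion above.

```cpp
#include <cassert>
#include <tuple>

// Hypothetical helper: binds a reference to the first tuple element
// and writes through it.
int write_through_reference() {
    std::tuple<int, double> some_tuple(1, 2.5);

    auto &name = std::get<0>(some_tuple);   // name is int&, an alias for the element
    assert(&name == &std::get<0>(some_tuple));

    name = 42;                              // modifies some_tuple itself
    return std::get<0>(some_tuple);
}

// auto &bad = 5;   // would not compile: a non-const lvalue reference
//                  // cannot bind to a prvalue
```

Calling write_through_reference() returns 42, because the write went through the reference into the tuple.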
Now, you could make such assignment syntaxes deduce references if the given value is a reference. But that would mean that you can't do:
auto name = get<0>(some_tuple);
This copies from the tuple, creating an object independent of some_tuple
. Sometimes, that's exactly what you want. This is even more useful if you want to move from the tuple with auto name = get<0>(std::move(some_tuple));
.
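A small sketch of the value case, under the assumption that the tuple holds a std::string (the helper names are hypothetical). Plain auto drops the reference that std::get returns, so name is a separate object either copied or moved from the element.

```cpp
#include <cassert>
#include <string>
#include <tuple>
#include <utility>

// Hypothetical helper: edits a copy and returns what the tuple still holds.
std::string copy_is_independent() {
    std::tuple<std::string, int> some_tuple("original", 1);

    auto name = std::get<0>(some_tuple);    // std::string, copied from the element
    name += " (edited)";
    assert(name == "original (edited)");

    return std::get<0>(some_tuple);         // the tuple element is untouched
}

// Hypothetical helper: moves the element out of the tuple instead of copying.
std::string move_out_of_tuple() {
    std::tuple<std::string, int> some_tuple("original", 1);

    // std::get on an rvalue tuple yields std::string&&, so name is move-constructed.
    auto name = std::get<0>(std::move(some_tuple));
    return name;
}
```

An assignment-only syntax that always deduced a reference would have no obvious way to spell either of these.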
OK, so maybe we could extend these syntaxes a bit to account for this distinction. Maybe &name := value;
or &name = value;
would mean to deduce a reference like auto&
.
OK, fine. What about this:
decltype(auto) name = some_thing();
Oh that's right; C++ actually has two placeholders: auto and decltype(auto). The basic idea of this deduction is that it works exactly as if you had done decltype(expr) name = expr;
. So in our case, if some_thing()
is an object, it will deduce an object. If some_thing()
is a reference, it will deduce a reference.
This is very useful when you're working in template code and are not sure exactly what the return value of a function will be. This is great for forwarding, and it is an essential tool, even if it is not widely used.
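Here is a rough illustration of the difference, assuming C++14 or later. The names global_counter, by_value, by_reference and deduce_both are invented for the example.

```cpp
#include <cassert>
#include <type_traits>

int global_counter = 0;                       // hypothetical state

int  by_value()     { return global_counter; }
int &by_reference() { return global_counter; }

int deduce_both() {
    decltype(auto) a = by_value();            // deduced as int
    decltype(auto) b = by_reference();        // deduced as int&
    auto           c = by_reference();        // plain auto: int, a copy

    static_assert(std::is_same<decltype(a), int>::value,  "value stays a value");
    static_assert(std::is_same<decltype(b), int&>::value, "reference stays a reference");
    static_assert(std::is_same<decltype(c), int>::value,  "auto drops the reference");

    assert(&b == &global_counter);            // b aliases the global
    b = 7;                                    // writes to global_counter
    c = 99;                                   // only changes the local copy
    (void)a;
    return global_counter;
}
```

deduce_both() returns 7: the write through b reached the global, while the write to c did not.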
So now we need to add more to our syntax. name ::= value;
means "do what decltype(auto)
does". I don't have an equivalent for the Pythonic variant.
Looking at this syntax, isn't that rather easy to accidentally mistype? Not only that, it's hardly self-documenting. Even if you've never seen decltype(auto)
before, it's big and obvious enough that you can at least easily tell that there's something special going on. Whereas the visual difference between ::=
and :=
is minimal.
But that's opinion stuff; there are more substantive issues. See, all of this is based on using assignment syntax. Well... what about places where you can't use assignment syntax? Like this:
for(auto &x : container)
Do we change that to for(&x := container)
? Because that seems to be saying something very different from range-based for
. It looks like it's the initializer statement from a regular for
loop, not a range-based for
. It would also be a different syntax from non-deduced cases.
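For comparison, this is what the placeholder form buys you in a range-based for. The sketch below uses a hypothetical function doubled; the only point is that auto and auto & slot into the loop exactly where a type name would go.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: shows by-value vs by-reference loop variables.
std::vector<int> doubled(std::vector<int> container) {
    const std::vector<int> before = container;

    for (auto x : container) { x = 0; }       // x is a copy of each element
    assert(container == before);              // so the container is unchanged

    for (auto &x : container) { x *= 2; }     // x aliases each element
    return container;
}
```

The same loop with an explicit type, for(int &x : container), has an identical shape, which is the consistency the placeholder approach keeps.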
Also, copy-initialization (using =
) is not the same thing in C++ as direct-initialization (using constructor syntax). So name := value;
may not work in cases where auto name(value)
would have.
Sure, you could declare that :=
will use direct-initialization, but that would be quite incongruent with the way the rest of C++ behaves.
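One case where the two forms differ is a type whose copy constructor is explicit. That is unusual in practice, and the Token type below is invented purely to illustrate the rule: copy-initialization only considers non-explicit constructors, while direct-initialization considers all of them.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical type with an explicit copy constructor.
struct Token {
    int id;
    explicit Token(int i) : id(i) {}
    explicit Token(const Token &) = default;
};

// Direct-initialization from a Token is allowed...
static_assert(std::is_constructible<Token, const Token &>::value,
              "Token name(value) is well-formed");
// ...but copy-initialization is not.
static_assert(!std::is_convertible<const Token &, Token>::value,
              "Token name = value is ill-formed");

int direct_init_id() {
    Token value(5);
    auto name(value);        // direct-initialization: compiles
    // auto other = value;   // copy-initialization: would not compile
    assert(name.id == value.id);
    return name.id;
}
```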
Also, there's one more thing: C++14. It gave us one useful deduction feature: return type deduction. But this is based on placeholders. So much like range-based for
, it is fundamentally based on a typename that gets filled in by the compiler, not by some syntax applied to a particular name and expression.
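A short sketch of C++14 return type deduction with both placeholders. The names storage, half, copy_storage, ref_storage and bump are hypothetical.

```cpp
#include <cassert>
#include <type_traits>

int storage = 3;                                    // hypothetical state

auto half(int x) { return x / 2.0; }                // deduced as double
auto copy_storage() { return (storage); }           // auto: int, returns a copy
decltype(auto) ref_storage() { return (storage); }  // decltype((storage)): int&

static_assert(std::is_same<decltype(half(1)), double>::value,     "double");
static_assert(std::is_same<decltype(copy_storage()), int>::value, "int");
static_assert(std::is_same<decltype(ref_storage()), int&>::value, "int&");

int bump() {
    assert(&ref_storage() == &storage);             // the returned reference is the global
    ref_storage() += 1;                             // so this modifies storage
    return storage;
}
```

There is no name = value expression anywhere in a function signature, so an assignment-based deduction syntax would have nothing to attach to here.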
See, all of these problems come from the same source: you're inventing entirely new syntax for declaring variables. Placeholder-based declarations didn't have to invent new syntax. They're using the exact same syntax as before; they're just employing a new keyword that acts like a type, but has a special meaning. This is what allows it to work in range-based for
and for return type deduction. It is what allows it to have multiple forms (auto
vs. decltype(auto)
). And so forth.
Placeholders work because they are the simplest solution to the problem, while simultaneously retaining all of the benefits and generality of using an actual type name. If you came up with another alternative that worked as universally as placeholders do, it is highly unlikely that it would be as simple as placeholders.
Unless it was just spelling placeholders with different keywords or symbols...
auto
could be dropped in some cases, but that would lead to inconsistency. First of all, as pointed out, the declaration syntax in C++ is <type> <varname>
. Explicit declarations require some type or at least a declaration keyword in its place. So we could use var <varname>
or declare <varname>
or something, but auto
is a long-standing keyword in C++, and is a good candidate for an automatic type deduction keyword.
Is it possible to implicitly declare variables by assignment without breaking everything?
Sometimes yes. You can't perform assignment outside functions, so you could use assignment syntax for declarations there. But such an approach would bring inconsistency to the language, possibly leading to human errors.
a = 0; // Error. Could be parsed as auto declaration instead.
int main() {
    return 0;
}
And when it comes to any kind of local variable, explicit declarations are the way of controlling the scope of a variable.
a = 1; // use a variable declared before or outside
auto b = 2; // declare a variable here
If ambiguous syntax was allowed, declaring global variables could suddenly convert local implicit declarations to assignments. Finding those conversions would require checking everything. And to avoid collisions you would need unique names for all globals, which kind of destroys the whole idea of scoping. So it's really bad.
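A minimal sketch of why the explicit keyword protects you here. Imagine the global counter below was added later, for example through a header; all names are hypothetical.

```cpp
#include <cassert>

int counter = 100;       // hypothetical global, added long after local_work was written

int local_work() {
    auto counter = 1;    // explicit declaration: always a new local that shadows the global
    counter += 1;
    assert(::counter == 100);   // the global is untouched
    return counter;
}

int global_value() { return counter; }
```

With implicit declarations, the line counter = 1; inside local_work would have silently become an assignment to the global the moment the global appeared.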
Adding to previous answers, one extra note from an old fart: It looks like you may see it as an advantage to be able to just start using a new variable without in any way declaring it.
In languages with the possibility of implicit definition of variables this can be a big problem, especially in larger systems. You make one typo and you debug for hours only to find out you unintentionally introduced a variable with a value of zero (or worse) - blue
vs bleu
, label
vs lable
... the result is you can't really trust any code without thorough checking on precise variable names.
Just using auto
tells both compiler and maintainer that it is your intention to declare a new variable.
Think about it: to avoid this sort of nightmare, the 'implicit none' statement was introduced in FORTRAN, and you see it used in all serious FORTRAN programs nowadays. Not having it is simply ... scary.
Dropping the explicit auto
would break the language:
e.g.
int main()
{
    int n;
    {
        auto n = 0; // this shadows the outer n.
    }
}
where you can see that dropping the auto
would not shadow the outer n
.
Your question allows two interpretations:
Bathsheba nicely answered the first interpretation; for the second, consider the following (assuming no other declarations exist so far; hypothetically valid C++):
int f();
double g();
n = f(); // declares a new variable, type is int;
d = g(); // another new variable, type is double
if(n == d)
{
    n = 7; // reassigns n
    auto d = 2.0; // new d, shadowing the outer one
}
It would be possible; other languages get away with it quite well (well, apart from the shadowing issue perhaps)... It is not so in C++, though, and the question (in the sense of the second interpretation) now is: Why?
This time, the answer is not as evident as in the first interpretation. One thing is obvious, though: The explicit requirement for the keyword makes the language safer (I do not know if this is what drove the language committee to its decision, still it remains a point):
grummel = f();
// ...
if(true)
{
    brummel = f();
    //^ uh, oh, a typo...
}
Can we agree on this not needing any further explanations?
The even bigger danger in not requiring auto, [however], is that it means that adding a global variable in a place far away from a function (e.g. in a header file) can turn what was intended to be the declaration of a locally-scoped variable in that function into an assignment to the global variable... with potentially disastrous (and certainly very confusing) consequences.
(citing psmears' comment due to its importance; thanks for the hint)