I am new to C#, from a C++ background. In C++ you can do this:
class MyClass{
....
};
int main()
{
MyClass object;
In C#, class type objects are always allocated on the heap, i.e. variables of such types are always references ("pointers"). Just declaring a variable of such a type does not cause the allocation of an object. Allocating a class object on the stack, as is common in C++, isn't (in general) an option in C#.
Local variables of any type that have not been assigned to are considered uninitialized, and they cannot be read until they have been assigned to. This is a design choice (another way would have been to assign default(T) to every variable at declaration time) which seems like a good idea because it should protect you from some programming errors.
It's similar to how in C++ it wouldn't make sense to say SomeClass *object; and never assign anything to it.
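As a rough illustration of that rule, here is a minimal sketch. The class name DefiniteAssignmentDemo and the method Describe are made up for this example; the commented-out line is the one the compiler would reject.

```csharp
using System;

static class DefiniteAssignmentDemo
{
    // Hypothetical helper: picks a greeting at run time.
    public static string Describe(bool formal)
    {
        string greeting;                 // declared, but not yet assigned

        // Console.WriteLine(greeting);  // would not compile: error CS0165,
        //                               // "Use of unassigned local variable"

        if (formal)
            greeting = "Good day";
        else
            greeting = "Hi";

        // Both branches assign, so the compiler accepts this read.
        return greeting;
    }

    static void Main()
    {
        Console.WriteLine(Describe(true));   // prints "Good day"
        Console.WriteLine(Describe(false));  // prints "Hi"
    }
}
```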
Because in C# all class type variables are pointers, allocating an empty object when the variable is declared would lead to inefficient code when you actually only want to assign a value to the variable later, for instance in situations like this:
// Needs to be declared here to be available outside of `try`
Foo f;
try { f = GetFoo(); }
catch (SomeException) { return null; }
f.Bar();
Or
Foo f;
if (bar)
f = GetFoo();
else
f = GetDifferentFoo();
When you use reference types, then in this statement
Car c = new Car();
two entities are created: a reference named c, which lives on the stack, and the object of type Car itself, which lives on the heap.
If you just write
Car c;
then you create an uninitialized reference (provided that c is a local variable) that points nowhere.
In fact it is equivalent to C++ code that uses pointers instead of references.
For example
Car *c = new Car();
or just
Car *c;
The difference between C++ and C# is that C++ can create instances of classes on the stack, like
Car c;
In C# this means creating a reference of type Car that, as I said, points nowhere.
In C# you can do a similar thing:
// please notice "struct"
struct MyStruct {
....
}
MyStruct sample1; // reserves storage (typically on the stack); fields must be assigned before they are read
MyStruct sample2 = new MyStruct(); // same kind of storage, with every field set to its default value
Recall that primitives like int, double, and bool are also structs, so even though it's conventional to write
int i;
we may also write
int i = new int();
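A small sketch of what that looks like in practice; the class name PrimitiveDemo is made up for the example, and everything else is standard library.

```csharp
using System;

static class PrimitiveDemo
{
    static void Main()
    {
        int i = new int();        // same value as default(int)
        bool b = new bool();      // same value as default(bool)
        double d = new double();  // same value as default(double)

        Console.WriteLine(i);                        // prints 0
        Console.WriteLine(b);                        // prints False
        Console.WriteLine(d);                        // prints 0
        Console.WriteLine(typeof(int).IsValueType);  // prints True
    }
}
```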
Unlike C++, C# doesn't use pointers (in safe mode) to instances; however, C# has class and struct declarations:
class: you have a reference to an instance, memory is allocated on the heap, new is mandatory; similar to MyClass* in C++
struct: you have a value, memory is (usually) allocated on the stack, new is optional; similar to MyClass in C++
In your particular case you can just turn Car into a struct:
struct Car
{
public int i;
public int j;
}
and so the fragment
Car x; // since Car is struct, new is optional now
x.i = 2;
x.j = 3;
will be correct.
From the Microsoft programming guide:
At run time, when you declare a variable of a reference type, the variable contains the value null until you explicitly create an instance of the object by using the new operator, or assign it an object that has been created elsewhere by using new
A class is a reference type. When an object of the class is created, the variable to which the object is assigned holds only a reference to that memory. When the object reference is assigned to a new variable, the new variable refers to the original object. Changes made through one variable are reflected in the other variable because they both refer to the same data.
A struct is a value type. When a struct is created, the variable to which the struct is assigned holds the struct's actual data. When the struct is assigned to a new variable, it is copied. The new variable and the original variable therefore contain two separate copies of the same data. Changes made to one copy do not affect the other copy.
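The difference described in those two paragraphs can be seen in a short sketch. The type names CarClass and CarStruct are invented for the illustration.

```csharp
using System;

class CarClass { public int I; }    // reference type (hypothetical name)
struct CarStruct { public int I; }  // value type (hypothetical name)

static class CopyDemo
{
    static void Main()
    {
        var c1 = new CarClass { I = 1 };
        var c2 = c1;              // copies the reference; both name one object
        c2.I = 99;
        Console.WriteLine(c1.I);  // prints 99

        var s1 = new CarStruct { I = 1 };
        var s2 = s1;              // copies the data; two independent values
        s2.I = 99;
        Console.WriteLine(s1.I);  // prints 1
    }
}
```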
I think in your C# example you're effectively trying to assign values through a null pointer. Translated to C++, this would look like:
Car* x = nullptr;
x->i = 2;
x->j = 3;
This would compile, but dereferencing a null pointer is undefined behavior and in practice usually crashes.
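For comparison, here is a sketch of the C# side, assuming Car is a class with public fields i and j (the original C# snippet is not shown here, so that shape is a guess). Note that with a plain unassigned local the C# compiler rejects the code outright; the run-time failure only appears once the variable is explicitly set to null.

```csharp
using System;

class Car { public int i; public int j; }  // assumed shape of the class

static class NullDemo
{
    static void Main()
    {
        // Car y; y.i = 2;    // would not compile: error CS0165 (unassigned local)

        Car x = null;         // explicitly null, so the next write compiles
        try
        {
            x.i = 2;          // there is no object behind the reference
        }
        catch (NullReferenceException)
        {
            Console.WriteLine("NullReferenceException");
        }

        x = new Car();        // now there is an object to write to
        x.i = 2;
        x.j = 3;
        Console.WriteLine(x.i + x.j);  // prints 5
    }
}
```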
There are a lot of misconceptions here, both in the question itself and in the several answers.
Let me begin by examining the premise of the question. The question is "why do we need the new keyword in C#?" The motivation for the question is this fragment of C++:
MyClass object; // this will create object in memory
MyClass* object = new MyClass(); // this does same thing
I criticize this question on two grounds.
First, these do not do the same thing in C++, so the question is based on a faulty understanding of the C++ language. It is very important to understand the difference between these two things in C++, so if you do not understand very clearly what the difference is, find a mentor who can teach you how to know what the difference is, and when to use each.
Second, the question presupposes -- incorrectly -- that those two syntaxes do the same thing in C++, and then, oddly, asks "why do we need new in C#?" Surely the right question to ask given this -- again, false -- presupposition is "why do we need new in C++?" If those two syntaxes do the same thing -- which they do not -- then why have two syntaxes in the first place?
So the question is both based on a false premise, and the question about C# does not actually follow from the -- misunderstood -- design of C++.
This is a mess. Let's throw out this question and ask some better questions. And let's ask the question about C# qua C#, and not in the context of the design decisions of C++.
What does the new X operator do in C#, where X is a class or struct type? (Let's ignore delegates and arrays for the purposes of this discussion.)
The new operator:
All right, I can already hear the objections from C# programmers out there, so let's dismiss them.
Objection: no new storage is allocated if the type is a value type, I hear you say. Well, the C# specification disagrees with you. When you say
S s = new S(123);
for some struct type S, the spec says that new temporary storage is allocated on the short-term pool, initialized to its default values, the constructor runs with this set to refer to the temp storage, and then the resulting object is copied to s. However, the compiler is permitted to use a copy-elision optimization provided that it can prove that it is impossible for the optimization to become observed in a safe program. (Exercise: work out under what circumstances a copy elision cannot be performed; give an example of a program that would have different behaviours if elision was or was not used.)
Objection: a valid instance of a value type can be produced using default(S); no constructor is called, I hear you say. That's correct. I didn't say that new is the only way to create an instance of a value type.
In fact, for a value type that does not declare its own parameterless constructor (which C# permits only from version 10 onward), new S() and default(S) are the same thing.
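A quick way to check this for a user-defined struct, sketched under the assumption that the struct declares no parameterless constructor of its own. The field names X and Name are invented for the example.

```csharp
using System;

struct S
{
    public int X;
    public string Name;
}

static class DefaultDemo
{
    static void Main()
    {
        S a = new S();
        S b = default(S);

        Console.WriteLine(a.X == b.X);      // prints True (both are 0)
        Console.WriteLine(a.Name == null);  // prints True
        Console.WriteLine(a.Equals(b));     // prints True (field-by-field comparison)
    }
}
```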
Objection: Is a constructor really executed for situations like new S(), if not present in the source code in C# 6, I hear you say. This is an "if a tree falls in the forest and no one hears it, does it make a sound?" question. Is there a difference between a call to a constructor that does nothing, and no call at all? This is not an interesting question. The compiler is free to elide calls that it knows do nothing.
Suppose we have a variable of value type. Must we initialize the variable with an instance produced by new?
No. Variables which are automatically initialized, such as fields and array elements, will be initialized to the default value -- that is, the value of the struct where all the fields are themselves their default values.
Formal parameters will be initialized with the argument, obviously.
Local variables of value type are required to be definitely assigned with something before the fields are read, but it need not be a new expression.
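The three cases for value types (fields, array elements, locals) might look like this in code. Point and Holder are hypothetical types introduced only for the sketch.

```csharp
using System;

struct Point { public int X; public int Y; }

class Holder { public Point P; }   // a field: automatically initialized

static class ValueInitDemo
{
    static void Main()
    {
        var h = new Holder();
        Console.WriteLine(h.P.X);       // prints 0

        var arr = new Point[2];         // array elements: automatically initialized
        Console.WriteLine(arr[1].Y);    // prints 0

        Point local;                    // a local: not automatically initialized
        local.X = 1;                    // assigning every field...
        local.Y = 2;                    // ...definitely assigns the struct, without new
        Console.WriteLine(local.X + local.Y);  // prints 3
    }
}
```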
So effectively, variables of value type are automatically initialized with the equivalent of default(S), unless they are locals?
Yes.
Why not do the same for locals?
Use of an uninitialized local is strongly associated with buggy code. The C# language disallows this because doing so finds bugs.
Suppose we have a variable of reference type. Must we initialize it with an instance produced by new?
No. Automatic-initialization variables will be initialized with null. Locals can be initialized with any reference, including null, and must be definitely assigned before being read.
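The same idea for reference types, as a minimal sketch; R here is an empty placeholder class and Owner is a made-up holder for a field.

```csharp
using System;

class R { }

class Owner { public R Field; }    // a field: automatically initialized to null

static class ReferenceInitDemo
{
    static void Main()
    {
        Console.WriteLine(new Owner().Field == null);  // prints True

        var arr = new R[3];                            // elements start out null
        Console.WriteLine(arr[0] == null);             // prints True

        R a = null;      // a local may be assigned null...
        R b = new R();   // ...or a new instance...
        R c = b;         // ...or any existing reference
        Console.WriteLine(a == null);                     // prints True
        Console.WriteLine(object.ReferenceEquals(b, c));  // prints True
    }
}
```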
So effectively, variables of reference type are automatically initialized with null, unless they are locals?
Yes.
Why not do the same for locals?
Same reason. A likely bug.
Why not automatically initialize variables of reference type by calling the default constructor automatically? That is, why not make R r; the same as R r = new R();?
Well, first of all, many types do not have a default constructor, or for that matter, any accessible constructor at all. Second, it seems weird to have one rule for an uninitialized local or field, another rule for a formal, and yet another rule for an array element. Third, the existing rule is very simple: a variable must be initialized to a value; that value can be anything you like; why is the assumption that a new instance is desired warranted? It would be bizarre if this
R r;
if (x) r = M(); else r = N();
caused a constructor to run to initialize r.
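To make the point concrete, here is a sketch in which R has no accessible constructor at all and counts how often its constructor runs. Making M and N static factory methods on R, and adding the Constructed counter and Source field, are assumptions made for the example.

```csharp
using System;

class R
{
    public static int Constructed;   // counts constructor runs
    public readonly string Source;

    // Private: code outside R cannot write "new R(...)".
    private R(string source) { Source = source; Constructed++; }

    public static R M() => new R("M");
    public static R N() => new R("N");
}

static class BranchDemo
{
    public static R Pick(bool x)
    {
        R r;                              // nothing is constructed here
        if (x) r = R.M(); else r = R.N();
        return r;
    }

    static void Main()
    {
        R r = Pick(true);
        Console.WriteLine(r.Source);       // prints M
        Console.WriteLine(R.Constructed);  // prints 1
    }
}
```

Exactly one instance is created per call to Pick, which is what the existing rule gives you for free.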
Leaving aside the semantics of the new operator, why is it necessary syntactically to have such an operator?
It's not. There are any number of alternative syntaxes that could be grammatical. The most obvious would be to simply eliminate the new entirely. If we have a class C with a constructor C(int) then we could simply say C(123) instead of new C(123). Or we could use a syntax like C.construct(123) or some such thing. There are any number of ways to do this without the new operator.
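C# as it exists today can approximate the C.construct(123) idea with an ordinary static factory method. The sketch below is only an illustration of that approximation; the method name Construct and the Value property are invented for it, and the factory still uses new internally.

```csharp
using System;

class C
{
    public int Value { get; }

    public C(int value) { Value = value; }

    // Hypothetical factory standing in for the "C.construct(123)" syntax.
    public static C Construct(int value) => new C(value);
}

static class FactoryDemo
{
    static void Main()
    {
        C a = new C(123);
        C b = C.Construct(123);
        Console.WriteLine(a.Value == b.Value);  // prints True
    }
}
```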
So why have it?
First, C# was designed to be immediately familiar to users of C++, Java, JavaScript, and other languages that use new to indicate new storage is being initialized for an object.
Second, the right level of syntactic redundancy is highly desirable. Object creation is special; we wish to call out when it happens with its own operator.
Ignoring the stack vs heap side of things: because C# made the bad decision to copy C++ when they should have just made the syntax
Car car = Car()
(or something similar). Having 'new' is superfluous.