I have a question regarding how memory is managed for strongly typed generics.
List<int> ints1 = new List<int>();
ints1.Add(1); ints1.Add(2); ints1.Add(3);
Lists in C# internally contain arrays. A List<T> variable refers to a location on the heap, and at that location is an object holding an array that stores all the values. So the values are stored on the heap. The same goes for arrays, whether or not they are part of a class.
The stack is contiguous, so pushing another int onto the stack means that its memory address is adjacent to that of the previous int (offset by 4 bytes). Lists work differently: when you add items, they create an array larger than what you need. When you reach the length of the array, the list creates a larger array and copies the current values over.
Another thing that may interest you is linked lists. Linked lists don't work with arrays internally; instead they work with nodes. Each node contains data and the location of the next node in the list. In a doubly linked list, each node also contains the location of the previous node in the list.
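To make the node idea concrete, here is a minimal sketch. The Node<T> and LinkedListSketch names are hypothetical and only for illustration; the last method uses the real LinkedList<T> class from System.Collections.Generic, which is doubly linked.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical singly linked node: data plus a reference to the next node.
public sealed class Node<T>
{
    public T Value;
    public Node<T> Next; // null marks the end of the list

    public Node(T value) { Value = value; }
}

public static class LinkedListSketch
{
    // Builds the chain 1 -> 2 -> 3 and returns its head.
    public static Node<int> BuildSample()
    {
        var head = new Node<int>(1);
        head.Next = new Node<int>(2);
        head.Next.Next = new Node<int>(3);
        return head;
    }

    // Walks the chain from the head and counts the nodes.
    public static int CountNodes<T>(Node<T> head)
    {
        int count = 0;
        for (Node<T> n = head; n != null; n = n.Next) count++;
        return count;
    }

    // The built-in LinkedList<T> exposes nodes with both Next and Previous.
    public static int SecondViaBuiltIn()
    {
        var list = new LinkedList<int>(new[] { 1, 2, 3 });
        return list.First.Next.Value;
    }
}
```

Each node is its own heap object, so unlike the array behind a List<T>, the elements are not stored next to each other in memory.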
The "new" syntax is used to initialize both value types and reference types. The new list is created on the heap; the values are loaded on the stack (i.e. before they are added to the list), but once added, they are on the heap, in the int[] that underpins the list. Arrays are always on the heap.
The fact that they are copied to the array also answers part 2 I believe. The array is over-sized, and reallocated only when full.
Note: List<int> doesn't "become" a reference type; it is always a reference type.
Memory management for generics (Generic collections) is exactly the same as for non-generic types.
Your ints1 list uses an array under the covers, so it is the same as for ints2 (once it has been corrected). In both cases a block of memory on the heap is holding the int values.
The List<> class consists of an array plus an int Count and an int Capacity property. When you Add() an element, Count is incremented; when it would pass Capacity, a new array is allocated and the contents are copied.
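The Count/Capacity mechanism can be sketched roughly as follows. This is a simplified, hypothetical stand-in (SimpleIntList is not a real class, and the starting size of 4 and the doubling factor are assumptions made for the example), not the actual List<T> source.

```csharp
using System;

// Hypothetical, simplified stand-in for List<int>; not the real BCL source.
public sealed class SimpleIntList
{
    // The backing array is a heap object; 4 is an assumed starting size.
    private int[] _items = new int[4];

    public int Count { get; private set; }
    public int Capacity => _items.Length;

    public void Add(int value)
    {
        if (Count == _items.Length)
        {
            // The array is full: allocate a larger one and copy the contents.
            // Doubling is an assumption of this sketch.
            var bigger = new int[_items.Length * 2];
            Array.Copy(_items, bigger, Count);
            _items = bigger;
        }
        _items[Count] = value;
        Count++;
    }

    public int this[int index]
    {
        get
        {
            if (index < 0 || index >= Count)
                throw new ArgumentOutOfRangeException(nameof(index));
            return _items[index];
        }
    }
}
```

Adding a fifth element to this sketch triggers exactly one reallocation: Capacity goes from 4 to 8 while Count becomes 5.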
I presume as ints1 is initialised with a new keyword, new List<int>(), it becomes a reference type
This presumption is incorrect. You can use the "new" keyword on a value type too!
int x = new int();
Using "new" does not make anything a reference type. You can use "new" with reference types or value types. What "new" indicates is that storage is going to be allocated and a constructor is going to be called.
In the case of using "new" on a value type, the allocated storage is temporary storage. A reference to that temporary storage is passed to the constructor, and then the now-initialized result is copied to its final destination, if there is one. ("new" is usually used with an assignment but it need not be.)
In the case of a reference type, storage is allocated twice: long-term storage is allocated for the instance and short-term storage is allocated for the reference to long-term storage of the instance. The reference is passed to the constructor, which initializes the long-term storage. The reference is then copied from short-term storage to its final destination, if there is one.
What makes List<int> a reference type is that List<T> is declared as a class.
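A small check of this point, as a sketch: the NewKeywordDemo class and its method names are hypothetical, while typeof(...).IsValueType is the standard reflection property.

```csharp
using System;
using System.Collections.Generic;

public static class NewKeywordDemo
{
    // int is a struct, so this is true even though "new int()" is legal.
    public static bool IntIsValueType() => typeof(int).IsValueType;

    // List<T> is declared as a class, so this is false.
    public static bool ListIsValueType() => typeof(List<int>).IsValueType;

    // Copying a value type copies the data; copying a reference shares the object.
    public static int CopySemantics()
    {
        int x = new int();       // same as int x = 0; still a value type
        int y = x;               // y gets its own copy
        y = 5;                   // x is unaffected

        var a = new List<int>();
        var b = a;               // b refers to the same list as a
        b.Add(1);                // visible through a as well

        return a.Count + x;      // 1 + 0
    }
}
```

The difference in copy behaviour comes from how the type is declared (struct versus class), not from whether "new" appears at the point of use.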
Where are the values 1, 2, 3 stored in memory (are they stored on the stack or on the heap)?
We've worked hard to make a memory manager that lets you not care where things are stored. Values are stored in either a short-term memory pool (implemented as the stack or registers) or a long-term memory pool (implemented as a garbage-collected heap). Storage is allocated depending on the known lifetime of the value. If the value is known to be short-lived then its storage is allocated on the short-term pool. If the value is not known to be short-lived then it must be allocated on the long-term pool.
The 1, 2, 3 owned by the list could live forever; we do not know whether that list is going to outlive the current activation frame or not. Therefore the memory to store the 1, 2, 3 is allocated on the long-term pool.
Do not believe the lie that "value types are always allocated on the stack". Obviously that cannot be true because then a class or array containing a number could not survive the current stack frame! Value types are allocated on the pool that makes sense for their known lifetime.
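As an illustration of a value type whose storage has to outlive the stack frame, consider this sketch (Counter and LifetimeDemo are hypothetical names introduced for the example):

```csharp
using System;

public sealed class Counter
{
    public int Value; // an int (a value type) stored inside a heap object
}

public static class LifetimeDemo
{
    public static Counter Make()
    {
        int local = 42;        // known to be short-lived: short-term storage is enough
        var c = new Counter(); // the instance goes to long-term storage
        c.Value = local;       // the value is copied into the object
        return c;              // the object, and the int inside it, outlive this call
    }
}
```

After Make() returns, its activation frame is gone, yet Make().Value still reads 42, so that int cannot have been living on the stack.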
List<int> can scale its size to any size at runtime, unlike int[]
Correct. It is educational to see how List<T> does that. It simply allocates an array of T larger than it needs. If it discovers that it guessed too small, it allocates a new, larger array and copies the old array contents to the new one. A List<T> is just a convenient wrapper around a bunch of array copies!
if the values 1, 2, 3 were stored on the stack, and a new item 4 is added to the list, then it wouldn't be contiguous with the first three.
Correct. That's one reason why the storage for the values 1, 2, 3 is not allocated on the stack. The storage is actually an array allocated on the heap.
so how will the list know the memory location of item 4?
The list allocates an array that is too big. When you add a new item, it sticks it into unused space in the too-big array. When the array runs out of room, it allocates a new array.
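You can watch this happen on the real List<int> by logging its Capacity property as items are added. This is only an observation sketch: CapacityDemo is a hypothetical name, and the exact capacities printed are an implementation detail that may differ between runtime versions.

```csharp
using System;
using System.Collections.Generic;

public static class CapacityDemo
{
    // Adds n items and returns how many times the backing array was replaced.
    public static int CountReallocations(int n)
    {
        var list = new List<int>();
        int last = list.Capacity;
        int changes = 0;

        for (int i = 0; i < n; i++)
        {
            list.Add(i);
            if (list.Capacity != last)
            {
                // Capacity only changes when a new, larger array was allocated.
                Console.WriteLine($"Count={list.Count}: Capacity {last} -> {list.Capacity}");
                last = list.Capacity;
                changes++;
            }
        }
        return changes;
    }
}
```

Calling CapacityDemo.CountReallocations(100) reports far fewer than 100 capacity changes, because most Add calls just fill unused space in the existing array.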
Question 1: http://msdn.microsoft.com/en-us/library/6sh2ey19.aspx says:
The List<T> class is the generic equivalent of the ArrayList class. It implements the IList<T> generic interface using an array whose size is dynamically increased as required.
This looks like a simple array that is just reallocated when it overflows. AFAIR the size is doubled on every reallocation; I researched that once, but can't remember what for.
The array is allocated on the managed heap, just as it would be if you had declared it yourself.
List<T> is a reference type no matter how you look at it. All reference types are allocated on the heap. I do not know whether the C# compiler is clever enough yet to figure out that an object which is not used outside of a method can be allocated on the stack (Eric Lippert might be able to tell us), but even if it is, that's something that you, as a programmer, do not need to worry about. It will just be an optimization that the compiler does for you, without you ever noticing.
An array of int is also a reference type, and it is also allocated on the heap; it is just as simple as that. There is no point in wondering about some hypothetical fragmentation of arrays on the stack, because they are simply not allocated on the stack.
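A short sketch that shows int[] behaving as a reference type (ArrayReferenceDemo is a hypothetical name; typeof(...).IsValueType is the standard reflection property):

```csharp
using System;

public static class ArrayReferenceDemo
{
    // Arrays derive from System.Array, a class, so this is false.
    public static bool ArrayIsValueType() => typeof(int[]).IsValueType;

    public static int Aliased()
    {
        int[] a = { 1, 2, 3 };
        int[] b = a;   // copies the reference, not the elements
        b[0] = 99;     // writes into the one shared array
        return a[0];   // 99, because a and b refer to the same array
    }
}
```

The elements themselves are ints, but they live inside the array object, which is why a change made through b is visible through a.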