I've always been told that adding an element to an array happens like this:
An empty copy of the array+1element is created and then the data from t
If you are going to be doing a lot of adding and you will not be doing random access (such as myArray[i]), you could consider using a linked list (LinkedList<T>), because it will never have to "grow" like the List<T> implementation. Keep in mind, though, that a LinkedList<T> has no indexer: you can only really reach its items sequentially, typically through the IEnumerable<T> interface.
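A minimal sketch of that trade-off, assuming a hypothetical helper named LinkedListSketch.AppendAndSum (the name is not from the answer above): appending is cheap, but the items are then read back by enumeration rather than by index.

```csharp
using System;
using System.Collections.Generic;

public static class LinkedListSketch
{
    // Hypothetical helper: appends 0..count-1, then sums them by enumeration.
    public static long AppendAndSum(int count)
    {
        var list = new LinkedList<int>();
        for (int i = 0; i < count; i++)
        {
            list.AddLast(i); // allocates one node; existing items are never copied
        }

        long sum = 0;
        foreach (int item in list) // sequential access through IEnumerable<T>
        {
            sum += item;
        }

        // There is no list[i]; reaching the i-th item means walking the nodes.
        return sum;
    }
}
```

Calling LinkedListSketch.AppendAndSum(4) adds 0, 1, 2 and 3 and returns 6.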
A standard array should be defined with a length, which reserves all of the memory that it needs in a contiguous block. Adding an item to the array would put it inside of the block of already reserved memory.
Arrays are great for few writes and many reads, particularly those of an iterative nature - for anything else, use one of the many other data structures.
This really depends on what you mean by "add."
If you mean:
T[] array;
int i;
T value;
...
if (i >= 0 && i < array.Length)
    array[i] = value;
Then, no, this does not create a new array, and is in fact the fastest way to alter any kind of IList in .NET.
If, however, you're using something like ArrayList, List, Collection, etc., then calling the "Add" method may create a new array. They are smart about it, though: they don't just resize by 1 element, they grow geometrically, so if you're adding lots of values, only every once in a while will they have to allocate a new array. Even then, you can use the "Capacity" property to force the list to grow beforehand if you know how many elements you're adding (list.Capacity += numberOfAddedElements).
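One way to see this for yourself is to watch List<T>.Capacity while adding. The sketch below uses a hypothetical helper, CapacitySketch.CountReallocations, and deliberately does not assume any particular growth factor, since that is an implementation detail.

```csharp
using System;
using System.Collections.Generic;

public static class CapacitySketch
{
    // Hypothetical helper: adds `count` items and counts how many times
    // the list's capacity (i.e. its backing array) changed along the way.
    public static int CountReallocations(int count, bool presize)
    {
        var list = new List<int>();
        if (presize)
        {
            list.Capacity = count; // reserve room for everything up front
        }

        int reallocations = 0;
        int lastCapacity = list.Capacity;
        for (int i = 0; i < count; i++)
        {
            list.Add(i);
            if (list.Capacity != lastCapacity)
            {
                reallocations++;
                lastCapacity = list.Capacity;
            }
        }
        return reallocations;
    }
}
```

With presize set to true the count stays at 0; without it, adding 1000 items should trigger only a handful of reallocations rather than one per Add.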
You are correct: an array is great for lookups. However, modifications to the size of the array are costly.
You should use a container that supports incremental size adjustments in the scenario where you're modifying the size of the array. You could use an ArrayList which allows you to set the initial size, and you could continually check the size versus the capacity and then increment the capacity by a large chunk to limit the number of resizes.
Or you could just use a linked list. Then, however, lookups are slow...
When to abandon the use of arrays
First and foremost, when the semantics of arrays don't match your intent - Need a dynamically growing collection? A set which doesn't allow duplicates? A collection that has to remain immutable? Avoid arrays in all those cases. That's 99% of the cases. Just stating the obvious basic point.
Secondly, when you are not coding for absolute performance criticality - That's about 95% of the cases. Arrays perform marginally better, especially in iteration. It almost never matters.
When you're not forced by an argument with the params keyword - I just wish params accepted any IEnumerable<T>, or even better, a language construct itself to denote a sequence (and not a framework type).
When you are not writing legacy code, or dealing with interop
In short, it's very rare that you would actually need an array. I will add why one might want to avoid it.
The biggest reason to avoid arrays, imo, is conceptual. Arrays are closer to implementation and farther from abstraction. Arrays convey more of how it is done than what is done, which is against the spirit of high-level languages. That's not surprising, considering arrays are closer to the metal; they are straight out of a special type (though internally an array is a class). Not to be pedantic, but arrays really do translate to a semantic meaning that is very rarely required. The most useful and frequent semantics are those of collections with any entries, sets with distinct items, key-value maps, etc., with any combination of addable, read-only, immutable and order-respecting variants. Think about this: you might want an addable collection, or a read-only collection with predefined items and no further modification, but how often does your logic look like "I want a dynamically addable collection, but only a fixed number of them, and they should be modifiable too"? Very rarely, I would say.
The array was designed in the pre-generics era. It mimics genericity with a lot of run-time hacks, and it will show its oddities here and there. Some of the catches I found:
Broken covariance.
string[] strings = ...
object[] objects = strings;
objects[0] = 1; //compiles, but gives a runtime exception.
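The runtime exception in question is ArrayTypeMismatchException. A small sketch, with a hypothetical helper name (CovarianceSketch.StoreThrows), that makes the failure observable:

```csharp
using System;

public static class CovarianceSketch
{
    // Hypothetical helper: returns true if storing an int through an
    // object[] view of a string[] is rejected at run time.
    public static bool StoreThrows()
    {
        string[] strings = { "a", "b" };
        object[] objects = strings; // allowed: arrays of reference types are covariant
        try
        {
            objects[0] = 1; // compiles, but the runtime checks the real element type
            return false;
        }
        catch (ArrayTypeMismatchException)
        {
            return true;
        }
    }
}
```

CovarianceSketch.StoreThrows() returns true, because the underlying array is still a string[] no matter which static type it is viewed through.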
Arrays can give you a reference to a struct! That's unlike anywhere else. A sample:
struct Value { public int mutable; }
var array = new[] { new Value() };
array[0].mutable = 1; // <-- compiles!
// list[0].mutable = 1; on a List<Value> doesn't compile, since editing a copy makes no sense
Console.WriteLine(array[0].mutable); // 1, expected or unexpected? confusing surely
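To make the contrast with List<T> concrete, here is a sketch using a hypothetical Counter struct (renamed so it does not clash with Value in the snippet above). The array indexer hands back the element itself, while the list indexer hands back a copy.

```csharp
using System;
using System.Collections.Generic;

public struct Counter { public int Count; }

public static class StructElementSketch
{
    public static int MutateArrayElement()
    {
        var array = new[] { new Counter() };
        array[0].Count = 1; // writes into the element stored in the array
        return array[0].Count;
    }

    public static int MutateListElement()
    {
        var list = new List<Counter> { new Counter() };
        // list[0].Count = 1; // does not compile: list[0] returns a copy
        Counter copy = list[0];
        copy.Count = 1; // changes only the local copy
        return list[0].Count;
    }
}
```

MutateArrayElement() returns 1 and MutateListElement() returns 0. To change a struct held in a list, you have to write the whole modified value back with list[0] = copy.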
Run-time-implemented methods like ICollection<T>.Contains can behave differently for structs and classes. It's not a big deal, but if you forget to override the non-generic Equals correctly for reference types, expecting the generic collection to look for the generic Equals, you will get incorrect results.
public class Class : IEquatable<Class>
{
    public bool Equals(Class other)
    {
        Console.WriteLine("generic");
        return true;
    }
    public override bool Equals(object obj)
    {
        Console.WriteLine("non generic");
        return true;
    }
}
public struct Struct : IEquatable<Struct>
{
    public bool Equals(Struct other)
    {
        Console.WriteLine("generic");
        return true;
    }
    public override bool Equals(object obj)
    {
        Console.WriteLine("non generic");
        return true;
    }
}
classArray.Contains(test); // classArray is a Class[]; prints "non generic"
structArray.Contains(test); // structArray is a Struct[]; prints "generic"
The Length property and [] indexer on T[] seem to be regular properties that you can access through reflection (which should involve some magic), but when it comes to expression trees you have to spit out the exact same code the compiler does. There are ArrayLength and ArrayIndex methods to do that separately. One such question here. Another example:
Expression<Func<string>> e = () => new[] { "a" }[0];
//e.Body.NodeType == ExpressionType.ArrayIndex
Expression<Func<string>> e = () => new List<string>() { "a" }[0];
//e.Body.NodeType == ExpressionType.Call;
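For illustration, this is roughly what building those nodes by hand looks like with Expression.ArrayLength and Expression.ArrayIndex. The class and method names (ArrayExpressionSketch, BuildLength, BuildIndex) are made up for this sketch.

```csharp
using System;
using System.Linq.Expressions;

public static class ArrayExpressionSketch
{
    // Builds (string[] a) => a.Length by hand.
    public static Func<string[], int> BuildLength()
    {
        ParameterExpression array = Expression.Parameter(typeof(string[]), "a");
        UnaryExpression body = Expression.ArrayLength(array); // NodeType is ArrayLength
        return Expression.Lambda<Func<string[], int>>(body, array).Compile();
    }

    // Builds (string[] a, int i) => a[i] by hand.
    public static Func<string[], int, string> BuildIndex()
    {
        ParameterExpression array = Expression.Parameter(typeof(string[]), "a");
        ParameterExpression index = Expression.Parameter(typeof(int), "i");
        BinaryExpression body = Expression.ArrayIndex(array, index); // NodeType is ArrayIndex
        return Expression.Lambda<Func<string[], int, string>>(body, array, index).Compile();
    }
}
```

Given new[] { "a", "b" }, the delegate from BuildLength() returns 2 and the one from BuildIndex() returns "b" for index 1. Neither node is a property access or a method call, which is the oddity being described.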
How to abandon the use of arrays
The most commonly used substitute is List<T>, which has a cleaner API. But it is a dynamically growing structure, which means you can add to a List<T> at the end or insert anywhere, to any capacity. There is no substitute for the exact behaviour of an array, but people mostly use arrays as read-only collections where you can't add anything to the end. A substitute is ReadOnlyCollection<T>. I carry this extension method:
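// Extension methods must be static, live in a static class, and mark the first parameter with "this".
public static ReadOnlyCollection<T> ToReadOnlyCollection<T>(this IEnumerable<T> source)
{
    return source.ToList().AsReadOnly();
}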
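A short sketch of what that buys you. The helper names here (ReadOnlySketch.Snapshot and the two checks) are hypothetical, and the sketch uses a plain static method rather than an extension method so that it stands alone.

```csharp
using System;
using System.Collections.Generic;
using System.Collections.ObjectModel;
using System.Linq;

public static class ReadOnlySketch
{
    // Same body as the extension method: copy the items, then wrap the copy.
    public static ReadOnlyCollection<T> Snapshot<T>(IEnumerable<T> source)
    {
        return source.ToList().AsReadOnly();
    }

    // Returns true if adding through the collection interface is rejected.
    public static bool AddIsRejected()
    {
        ReadOnlyCollection<int> items = Snapshot(new[] { 1, 2, 3 });
        try
        {
            ((ICollection<int>)items).Add(4); // Add is only reachable via the interface
            return false;
        }
        catch (NotSupportedException)
        {
            return true;
        }
    }

    // Returns the snapshot's count after the source list has grown.
    public static int CountAfterSourceGrows()
    {
        var source = new List<int> { 1, 2, 3 };
        ReadOnlyCollection<int> items = Snapshot(source);
        source.Add(4); // the snapshot wraps its own copy, so it is unaffected
        return items.Count;
    }
}
```

AddIsRejected() returns true and CountAfterSourceGrows() returns 3. Because of the ToList() call the result is a snapshot; calling AsReadOnly() directly on an existing List<T> would instead give a live read-only view of that list.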