I\'m writing a game that has a huge 2D array of \"cells\". A cell takes only 3 bytes. I also have a class called CellMap, which contains the 2D array as a private field, and pro
Eric Lippert's approach is good, but I would suggest using a base class rather than an interface for the indirect accessor. The following program demonstrates a class which acts like a sparse array of points. Provided that one never persists any item of type PointRef(*), things should work beautifully. Saying:
MyPointHolder(123) = somePoint
or
MyPointHolder(123).thePoint = somePoint
will both create a temporary pointRef object (a pointRef.onePoint in one case; a pointHolder.IndexedPointRef in the other) but the widening typecasts work to maintain value semantics. Of course, things would have been much easier if (1) methods on value types could be marked as mutators, and (2) writing a field of a structure accessed via property would could automatically read the property, edit the temporary structure, and write it back. The approach used here works, though alas I don't know any way to make it generic.
(*) Items of type PointRef should only be returned by properties, and should never be stored in a variable or used as parameters to anything other than a setter property which will convert to a Point.
MustInherit Class PointRef Public MustOverride Property thePoint() As Point Public Property X() As Integer Get Return thePoint.X End Get Set(ByVal value As Integer) Dim mypoint As Point = thePoint mypoint.X = value thePoint = mypoint End Set End Property Public Property Y() As Integer Get Return thePoint.X End Get Set(ByVal value As Integer) Dim mypoint As Point = thePoint mypoint.Y = value thePoint = mypoint End Set End Property Public Shared Widening Operator CType(ByVal val As Point) As PointRef Return New onePoint(val) End Operator Public Shared Widening Operator CType(ByVal val As PointRef) As Point Return val.thePoint End Operator Private Class onePoint Inherits PointRef Dim myPoint As Point Sub New(ByVal pt As Point) myPoint = pt End Sub Public Overrides Property thePoint() As System.Drawing.Point Get Return myPoint End Get Set(ByVal value As System.Drawing.Point) myPoint = value End Set End Property End Class End Class Class pointHolder Dim myPoints As New Dictionary(Of Integer, Point) Private Class IndexedPointRef Inherits PointRef Dim ref As pointHolder Dim index As Integer Sub New(ByVal ref As pointHolder, ByVal index As Integer) Me.ref = ref Me.index = index End Sub Public Overrides Property thePoint() As System.Drawing.Point Get Dim mypoint As New Point(0, 0) ref.myPoints.TryGetValue(index, mypoint) Return mypoint End Get Set(ByVal value As System.Drawing.Point) ref.myPoints(index) = value End Set End Property End Class Default Public Property item(ByVal index As Integer) As PointRef Get Return New IndexedPointRef(Me, index) End Get Set(ByVal value As PointRef) myPoints(index) = value.thePoint End Set End Property Shared Sub test() Dim theH1, theH2 As New pointHolder theH1(5).X = 9 theH1(9).Y = 20 theH2(12).X = theH1(9).Y theH1(20) = theH2(12) theH2(12).Y = 6 Dim h5, h9, h12, h20 As Point h5 = theH1(5) h9 = theH1(9) h12 = theH2(12) h20 = theH1(20) End Sub End Class
If you want to make Cell immutable - as you should if it is a struct - then a good technique is to make a factory that is an instance method on the Cell:
struct C
{
public int Foo { get; private set; }
public int Bar { get; private set; }
private C (int foo, int bar) : this()
{
this.Foo = foo;
this.Bar = bar;
}
public static C Empty = default(C);
public C WithFoo(int foo)
{
return new C(foo, this.Bar);
}
public C WithBar(int bar)
{
return new C(this.Foo, bar);
}
public C IncrementFoo()
{
return new C(this.Foo + 1, bar);
}
// etc
}
...
C c = C.Empty;
c = c.WithFoo(10);
c = c.WithBar(20);
c = c.IncrementFoo();
// c is now 11, 20
So your code would be something like
map[x,y] = map[x,y].IncrementPopulation();
However, I think this is possibly a blind alley; it might be better to simply not have so many Cells around in the first place, rather than trying to optimize a world where there are thousands of them. I'll write up another answer on that.
So there are actually two problems here. There's the question you actually asked: what are techniques to deal with the fact that structs ought to be immutable because they are copied by value, but you want to mutate one. And then there's the question which is motivating this one, which is "how can I make the performance of my program acceptable?"
My other answer addresses the first question, but the second question is interesting as well.
First off, if the profiler has actually identified that the performance problem is due to garbage collection of cells, then it is possible that making cell into a struct will help. It is also possible that it will not help at all, and it is possible that doing so will make it worse.
Your cells do not contain any reference types; we know this because you've said they are only three bytes. If someone else reading this is thinking that they could make a performance optimization by turning a class into a struct then it might not help at all because the class might contain a field of reference type, in which case the garbage collector still has to collect every instance, even if it is turned into a value type. The reference types in it need to be collected too! I would only recommend attempting this for performance reasons if Cell contains only value types, which apparently it does.
It might make it worse because value types are not a panacea; they have costs too. Value types are often more expensive to copy than reference types (which are pretty much always the size of a register, almost always aligned on the appropriate memory boundary, and therefore the chip is highly optimized for copying them). And value types are copied all the time.
Now, in your case you have a struct which is smaller than a reference; references are four or eight bytes typically. And you're putting them in an array, which means that you are packing the array down; if you have a thousand of them, it'll take three thousand bytes. Which means that three out of every four structs in there are misaligned, meaning more time (on many chip architectures) to get the value out of the array. You might consider measuring the impact of padding your struct out to four bytes to see if that makes a difference, provided you're still going to keep them in an array, which brings me to my next point...
The Cell abstraction might simply be a bad abstraction for the purpose of storing data about lots of cells. If the problem is that Cells are classes, you're keeping an array of thousands of Cells, and collecting them is expensive, then there are solutions other than making Cell into a struct. Suppose for example that a Cell contains two bytes of Population and one byte of Color. That is the mechanism of Cell, but surely that is not the interface you want to expose to the users. There is no reason why your mechanism has to use the same type as the interface. And therefore you could manufacture instances of the Cell class on demand:
interface ICell
{
public int Population { get; set; }
public Color Color { get; set; }
}
private class CellMap
{
private ushort[,] populationData; // Profile the memory burden vs speed cost of ushort vs int
private byte[,] colorData; // Same here.
public ICell this[int x, int y]
{
get { return new Cell(this, x, y); }
}
private sealed class Cell : ICell
{
private CellMap map;
private int x;
private int y;
public Cell(CellMap map, int x, int y)
{
this.map = map; // etc
}
public int Population
{
get { return this.map.populationData[this.x, this.y]; }
set { this.map.populationData[this.x, this.y] = (ushort) value; }
}
and so on. Manufacture the cells on demand. They will almost immediately be collected if they are short-lived. CellMap is an abstraction, so use the abstraction to hide the messy implementation details.
With this architecture you don't have any garbage collection problems because you have almost no live Cell instances, but you can still say
map[x,y].Population++;
no problem, because the first indexer manufactures an immutable object which knows how to update the state of the map. The Cell doesn't need to be mutable; notice that the Cell class is completely immutable. (Heck, the Cell could be a struct here, though of course casting it to ICell would just box it anyway.) It is the map which is mutable, and the cell mutates the map for the user.
6 . Use a ref
parameter in a method that mutates the value, call it as IncrementCellPopulation(ref cellMap[x, y])
Encapsulate what you want the CellMap to do, and allow access to the actual array only through appropriate methods like IncrementPopupation(int x, int y)
. Making an array (or any variable, for that matter) public in most cases is a serious code smell, as is returning an array in .NET.
For performance reasons, consider using a single dimensional array; those are way faster in .NET.
If your cell map is in fact "sparse", that is, if there are a lot of adjacent cells that have either no value or some default value, I would suggest that you do not create a cell object for these. Only create objects for cells that actually have some non-default state. (This might reduce the total number of cells by a significant amount, thereby taking pressure off the garbage collector.)
This approach would of course require you to find a new way to store your cell map. You would have to get away from storing your cells in an array (since they are not sparse), and embrace a different kind of data structure, probably a tree.
For example, you could subdivide your map into a number of uniform regions such that you can translate any cell coordinates into a corresponding region. (You could further subdivide each region into sub-regions according to the same idea.) You could then have a search tree per region where the cell coordinates act as key into the tree.
Such a scheme would allow you to only store the cells you need, while still offering fast access to any cell in your map. If no cell at some specified coordinates is found in the trees, it can be assumed that it's a default cell.