How to deal with the immutability of returned structs?

I'm writing a game that has a huge 2D array of "cells". A cell takes only 3 bytes. I also have a class called CellMap, which contains the 2D array as a private field, and pro

6 answers

离开以前

2021-02-04 17:18

Eric Lippert's approach is good, but I would suggest using a base class rather than an interface for the indirect accessor. The following program demonstrates a class which acts like a sparse array of points. Provided that one never persists any item of type PointRef(*), things should work beautifully. Saying:

  MyPointHolder(123) = somePoint

  MyPointHolder(123).thePoint = somePoint

will both create a temporary PointRef object (a PointRef.onePoint in one case; a pointHolder.IndexedPointRef in the other), but the widening typecasts work to maintain value semantics. Of course, things would have been much easier if (1) methods on value types could be marked as mutators, and (2) writing a field of a structure accessed via a property could automatically read the property, edit the temporary structure, and write it back. The approach used here works, though alas I don't know any way to make it generic.

(*) Items of type PointRef should only be returned by properties, and should never be stored in a variable or used as parameters to anything other than a setter property which will convert to a Point.

MustInherit Class PointRef
    Public MustOverride Property thePoint() As Point
    Public Property X() As Integer
        Get
            Return thePoint.X
        End Get
        Set(ByVal value As Integer)
            Dim mypoint As Point = thePoint
            mypoint.X = value
            thePoint = mypoint
        End Set
    End Property
    Public Property Y() As Integer
        Get
            Return thePoint.Y
        End Get
        Set(ByVal value As Integer)
            Dim mypoint As Point = thePoint
            mypoint.Y = value
            thePoint = mypoint
        End Set
    End Property
    Public Shared Widening Operator CType(ByVal val As Point) As PointRef
        Return New onePoint(val)
    End Operator
    Public Shared Widening Operator CType(ByVal val As PointRef) As Point
        Return val.thePoint
    End Operator
    Private Class onePoint
        Inherits PointRef

        Dim myPoint As Point

        Sub New(ByVal pt As Point)
            myPoint = pt
        End Sub

        Public Overrides Property thePoint() As System.Drawing.Point
            Get
                Return myPoint
            End Get
            Set(ByVal value As System.Drawing.Point)
                myPoint = value
            End Set
        End Property
    End Class
End Class


Class pointHolder
    Dim myPoints As New Dictionary(Of Integer, Point)
    Private Class IndexedPointRef
        Inherits PointRef

        Dim ref As pointHolder
        Dim index As Integer
        Sub New(ByVal ref As pointHolder, ByVal index As Integer)
            Me.ref = ref
            Me.index = index
        End Sub
        Public Overrides Property thePoint() As System.Drawing.Point
            Get
                Dim mypoint As New Point(0, 0)
                ref.myPoints.TryGetValue(index, mypoint)
                Return mypoint
            End Get
            Set(ByVal value As System.Drawing.Point)
                ref.myPoints(index) = value
            End Set
        End Property
    End Class

    Default Public Property item(ByVal index As Integer) As PointRef
        Get
            Return New IndexedPointRef(Me, index)
        End Get
        Set(ByVal value As PointRef)
            myPoints(index) = value.thePoint
        End Set
    End Property

    Shared Sub test()
        Dim theH1, theH2 As New pointHolder
        theH1(5).X = 9
        theH1(9).Y = 20
        theH2(12).X = theH1(9).Y
        theH1(20) = theH2(12)
        theH2(12).Y = 6
        Dim h5, h9, h12, h20 As Point
        h5 = theH1(5)
        h9 = theH1(9)
        h12 = theH2(12)
        h20 = theH1(20)
    End Sub
End Class

死守一世寂寞

2021-02-04 17:19

If you want to make Cell immutable - as you should if it is a struct - then a good technique is to make a factory that is an instance method on the Cell:

struct C
{
    public int Foo { get; private set; }
    public int Bar { get; private set; }
    private C (int foo, int bar) : this()
    {
        this.Foo = foo;
        this.Bar = bar;
    }
    public static readonly C Empty = default(C);
    public C WithFoo(int foo)
    {
        return new C(foo, this.Bar);
    }
    public C WithBar(int bar)
    {
        return new C(this.Foo, bar);
    }
    public C IncrementFoo()
    {
        return new C(this.Foo + 1, this.Bar);
    }
    // etc
}
...
C c = C.Empty;
c = c.WithFoo(10);
c = c.WithBar(20);
c = c.IncrementFoo();
// c is now 11, 20

So your code would be something like

map[x,y] = map[x,y].IncrementPopulation();
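
For that last line to compile, the map's indexer needs both a getter and a setter. Here is a minimal sketch of how the pieces might fit together, assuming C# 7.2 or later and a hypothetical Cell made of a two-byte population and a one-byte color (the field choice and the names `WithColor` and `CellMap` are illustrative, not taken from the question):

```csharp
using System;

// Hypothetical 3-byte cell: two bytes of population, one byte of color.
public readonly struct Cell
{
    public ushort Population { get; }
    public byte Color { get; }

    public Cell(ushort population, byte color)
    {
        Population = population;
        Color = color;
    }

    // Each "mutator" returns a new value and leaves the original untouched.
    public Cell WithColor(byte color) => new Cell(Population, color);

    public Cell IncrementPopulation() =>
        new Cell(checked((ushort)(Population + 1)), Color);
}

public sealed class CellMap
{
    private readonly Cell[,] cells;

    public CellMap(int width, int height)
    {
        cells = new Cell[width, height];
    }

    // The setter is what makes "read, compute a new value, write back" possible.
    public Cell this[int x, int y]
    {
        get => cells[x, y];
        set => cells[x, y] = value;
    }
}
```

The cost of this style is that every update copies the cell out of the array and back in, which is cheap for a struct this small but still worth measuring.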

However, I think this is possibly a blind alley; it might be better to simply not have so many Cells around in the first place, rather than trying to optimize a world where there are thousands of them. I'll write up another answer on that.

有刺的猬

2021-02-04 17:20
So there are actually two problems here. There's the question you actually asked: what are techniques to deal with the fact that structs ought to be immutable because they are copied by value, but you want to mutate one. And then there's the question which is motivating this one, which is "how can I make the performance of my program acceptable?"

My other answer addresses the first question, but the second question is interesting as well.

First off, if the profiler has actually identified that the performance problem is due to garbage collection of cells, then it is possible that making cell into a struct will help. It is also possible that it will not help at all, and it is possible that doing so will make it worse.

Your cells do not contain any reference types; we know this because you've said they are only three bytes. If someone else reading this is thinking that they could make a performance optimization by turning a class into a struct then it might not help at all because the class might contain a field of reference type, in which case the garbage collector still has to collect every instance, even if it is turned into a value type. The reference types in it need to be collected too! I would only recommend attempting this for performance reasons if Cell contains only value types, which apparently it does.

It might make it worse because value types are not a panacea; they have costs too. Value types are often more expensive to copy than reference types (which are pretty much always the size of a register, almost always aligned on the appropriate memory boundary, and therefore the chip is highly optimized for copying them). And value types are copied all the time.

Now, in your case you have a struct which is smaller than a reference; references are four or eight bytes typically. And you're putting them in an array, which means that you are packing the array down; if you have a thousand of them, it'll take three thousand bytes. Which means that three out of every four structs in there are misaligned, meaning more time (on many chip architectures) to get the value out of the array. You might consider measuring the impact of padding your struct out to four bytes to see if that makes a difference, provided you're still going to keep them in an array, which brings me to my next point...
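
To set up that measurement, one possible starting point is a pair of layouts like the following. This is only a sketch: the type names, the field split (two bytes of population plus one byte of color) and the `Reserved` padding field are assumptions, and `Marshal.SizeOf` reports the marshalled size, which for simple blittable structs like these is expected to match the per-element size in an array.

```csharp
using System;
using System.Runtime.InteropServices;

// Packed layout: 2 + 1 = 3 bytes per element, so most elements in an
// array start at an address that is not a multiple of four.
[StructLayout(LayoutKind.Sequential, Pack = 1)]
public struct PackedCell
{
    public ushort Population;
    public byte Color;
}

// Same data padded out to four bytes with an explicit, unused field.
[StructLayout(LayoutKind.Sequential, Pack = 1)]
public struct PaddedCell
{
    public ushort Population;
    public byte Color;
    public byte Reserved; // padding only; never read
}

public static class CellSizes
{
    public static int Packed => Marshal.SizeOf<PackedCell>();
    public static int Padded => Marshal.SizeOf<PaddedCell>();
}
```

With both types in hand, the same update loop can be timed over a `PackedCell[]` and a `PaddedCell[]` to see whether the extra 33% of memory buys any speed on the target hardware.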

The Cell abstraction might simply be a bad abstraction for the purpose of storing data about lots of cells. If the problem is that Cells are classes, you're keeping an array of thousands of Cells, and collecting them is expensive, then there are solutions other than making Cell into a struct. Suppose for example that a Cell contains two bytes of Population and one byte of Color. That is the mechanism of Cell, but surely that is not the interface you want to expose to the users. There is no reason why your mechanism has to use the same type as the interface. And therefore you could manufacture instances of the Cell class on demand:
```
interface ICell
{
    int Population { get; set; }
    Color Color { get; set; }
}

class CellMap
{
    private ushort[,] populationData; // Profile the memory burden vs speed cost of ushort vs int
    private byte[,] colorData; // Same here.

    // Every access hands out a fresh, short-lived proxy object.
    public ICell this[int x, int y]
    {
        get { return new Cell(this, x, y); }
    }

    private sealed class Cell : ICell
    {
        private readonly CellMap map;
        private readonly int x;
        private readonly int y;

        public Cell(CellMap map, int x, int y)
        {
            this.map = map;
            this.x = x;
            this.y = y;
        }

        // Reads and writes go straight through to the map's arrays.
        public int Population
        {
            get { return this.map.populationData[this.x, this.y]; }
            set { this.map.populationData[this.x, this.y] = (ushort) value; }
        }

        // Color is implemented the same way on top of colorData.
    }
}
```
and so on. Manufacture the cells on demand. They will almost immediately be collected if they are short-lived. CellMap is an abstraction, so use the abstraction to hide the messy implementation details.

With this architecture you don't have any garbage collection problems because you have almost no live Cell instances, but you can still say
```
map[x,y].Population++;
```
no problem, because the first indexer manufactures an immutable object which knows how to update the state of the map. The Cell doesn't need to be mutable; notice that the Cell class is completely immutable. (Heck, the Cell could be a struct here, though of course casting it to ICell would just box it anyway.) It is the map which is mutable, and the cell mutates the map for the user.
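
For readers who want something they can compile and step through, here is one self-contained variant of the idea. It makes assumptions the answer leaves open: the color is exposed as a raw `byte` rather than a `Color`, the map takes its dimensions in a constructor, and out-of-range populations throw instead of wrapping.

```csharp
using System;

public interface ICell
{
    int Population { get; set; }
    byte Color { get; set; }
}

public sealed class CellMap
{
    private readonly ushort[,] populationData;
    private readonly byte[,] colorData;

    public CellMap(int width, int height)
    {
        populationData = new ushort[width, height];
        colorData = new byte[width, height];
    }

    // Each access manufactures a short-lived proxy; no Cell is ever stored.
    public ICell this[int x, int y] => new Cell(this, x, y);

    private sealed class Cell : ICell
    {
        private readonly CellMap map;
        private readonly int x;
        private readonly int y;

        public Cell(CellMap map, int x, int y)
        {
            this.map = map;
            this.x = x;
            this.y = y;
        }

        // The proxy holds no cell state of its own; it forwards to the map.
        public int Population
        {
            get => map.populationData[x, y];
            set => map.populationData[x, y] = checked((ushort)value);
        }

        public byte Color
        {
            get => map.colorData[x, y];
            set => map.colorData[x, y] = value;
        }
    }
}
```

Because the indexer returns a reference type, `map[x, y].Population++` compiles and updates the underlying array. Returning a struct from the indexer instead would make that statement a compile-time error, since the result of an indexer is a value rather than a variable.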

星月不相逢

2021-02-04 17:25

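Use a ref parameter in a method that mutates the value, and call it as IncrementCellPopulation(ref cellMap[x, y]). This works only when cellMap is an actual array: an array element is a variable and can be passed by ref, whereas the value returned by an ordinary indexer cannot.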
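
A minimal sketch of the idea, assuming a mutable Cell struct with public fields and a plain two-dimensional array (the `CellOps` and `Demo` names are made up for illustration):

```csharp
using System;

public struct Cell
{
    public ushort Population;
    public byte Color;
}

public static class CellOps
{
    // "ref" passes the array element itself, not a copy of it,
    // so the change is visible in the array afterwards.
    public static void IncrementCellPopulation(ref Cell cell)
    {
        cell.Population++;
    }
}

public static class Demo
{
    public static ushort Run()
    {
        var cells = new Cell[8, 8];
        CellOps.IncrementCellPopulation(ref cells[2, 3]);
        CellOps.IncrementCellPopulation(ref cells[2, 3]);
        return cells[2, 3].Population; // 2: both calls updated the same element
    }
}
```

The catch is that the caller needs direct access to the array, so this fits best inside the class that owns the array rather than in its public surface.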

眼角桃花

2021-02-04 17:29

Encapsulate what you want the CellMap to do, and allow access to the actual array only through appropriate methods like IncrementPopulation(int x, int y). Making an array (or any variable, for that matter) public is in most cases a serious code smell, as is returning an array in .NET.

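For performance reasons, consider backing the map with a single-dimensional array and computing the index yourself; in .NET, zero-based single-dimensional arrays are typically faster to index than rectangular (multidimensional) ones. Measure before committing to it.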
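
Both suggestions can be combined. The following is a sketch, not a drop-in implementation: it stores only the population, uses row-major indexing, and the `GetPopulation` and `IndexOf` names are illustrative.

```csharp
using System;

public sealed class CellMap
{
    private readonly int width;
    private readonly int height;
    private readonly ushort[] population; // row-major: index = y * width + x

    public CellMap(int width, int height)
    {
        if (width <= 0) throw new ArgumentOutOfRangeException(nameof(width));
        if (height <= 0) throw new ArgumentOutOfRangeException(nameof(height));
        this.width = width;
        this.height = height;
        population = new ushort[width * height];
    }

    // Translates 2D coordinates into the flat index, rejecting bad input.
    // The unsigned comparison also catches negative values.
    private int IndexOf(int x, int y)
    {
        if ((uint)x >= (uint)width) throw new ArgumentOutOfRangeException(nameof(x));
        if ((uint)y >= (uint)height) throw new ArgumentOutOfRangeException(nameof(y));
        return y * width + x;
    }

    public int GetPopulation(int x, int y) => population[IndexOf(x, y)];

    // Callers never see the array; they only ask the map to change itself.
    public void IncrementPopulation(int x, int y) => population[IndexOf(x, y)]++;
}
```

Since the array never leaves the class, the storage layout can later be changed (padded structs, separate arrays per field, a sparse structure) without touching any caller.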

不知归路

2021-02-04 17:31

If your cell map is in fact "sparse", that is, if there are a lot of adjacent cells that have either no value or some default value, I would suggest that you do not create a cell object for these. Only create objects for cells that actually have some non-default state. (This might reduce the total number of cells by a significant amount, thereby taking pressure off the garbage collector.)

This approach would of course require you to find a new way to store your cell map. You would have to get away from storing your cells in an array (since arrays are not sparse), and embrace a different kind of data structure, probably a tree.

For example, you could subdivide your map into a number of uniform regions such that you can translate any cell coordinates into a corresponding region. (You could further subdivide each region into sub-regions according to the same idea.) You could then have a search tree per region where the cell coordinates act as key into the tree.

Such a scheme would allow you to only store the cells you need, while still offering fast access to any cell in your map. If no cell at some specified coordinates is found in the trees, it can be assumed that it's a default cell.
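
One way such a scheme could look is sketched below. Everything specific here is an assumption: coordinates are non-negative, regions are 64 by 64 cells, regions are found through a `Dictionary`, and each region uses a `SortedDictionary` as its search tree. Only one level of subdivision is shown.

```csharp
using System;
using System.Collections.Generic;

public struct Cell
{
    public ushort Population;
    public byte Color;
}

public sealed class SparseCellMap
{
    private const int RegionSize = 64; // hypothetical region edge length

    // One search tree per region, created only when the region is first written.
    private readonly Dictionary<long, SortedDictionary<int, Cell>> regions =
        new Dictionary<long, SortedDictionary<int, Cell>>();

    private static long RegionOf(int x, int y) =>
        ((long)(x / RegionSize) << 32) | (uint)(y / RegionSize);

    private static int KeyOf(int x, int y) =>
        (y % RegionSize) * RegionSize + (x % RegionSize);

    // Number of cells actually held in memory.
    public int StoredCellCount
    {
        get
        {
            int count = 0;
            foreach (var tree in regions.Values) count += tree.Count;
            return count;
        }
    }

    public Cell this[int x, int y]
    {
        get
        {
            // A missing region or a missing key means "default cell".
            if (regions.TryGetValue(RegionOf(x, y), out var tree) &&
                tree.TryGetValue(KeyOf(x, y), out var cell))
            {
                return cell;
            }
            return default(Cell);
        }
        set
        {
            long region = RegionOf(x, y);
            if (!regions.TryGetValue(region, out var tree))
            {
                tree = new SortedDictionary<int, Cell>();
                regions[region] = tree;
            }
            tree[KeyOf(x, y)] = value;
        }
    }
}
```

A fuller version would also drop a cell from its tree when it is set back to the default state, so the map stays sparse over time.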
