How to deal with the immutability of returned structs?

前端 未结 6 1877

I\'m writing a game that has a huge 2D array of \"cells\". A cell takes only 3 bytes. I also have a class called CellMap, which contains the 2D array as a private field, and pro

相关标签:
6条回答
  • 2021-02-04 17:18

    Eric Lippert's approach is good, but I would suggest using a base class rather than an interface for the indirect accessor. The following program demonstrates a class which acts like a sparse array of points. Provided that one never persists any item of type PointRef(*), things should work beautifully. Saying:

      MyPointHolder(123) = somePoint
    

    or

      MyPointHolder(123).thePoint = somePoint
    

    will both create a temporary pointRef object (a pointRef.onePoint in one case; a pointHolder.IndexedPointRef in the other) but the widening typecasts work to maintain value semantics. Of course, things would have been much easier if (1) methods on value types could be marked as mutators, and (2) writing a field of a structure accessed via property would could automatically read the property, edit the temporary structure, and write it back. The approach used here works, though alas I don't know any way to make it generic.

    (*) Items of type PointRef should only be returned by properties, and should never be stored in a variable or used as parameters to anything other than a setter property which will convert to a Point.

    MustInherit Class PointRef
        Public MustOverride Property thePoint() As Point
        Public Property X() As Integer
            Get
                Return thePoint.X
            End Get
            Set(ByVal value As Integer)
                Dim mypoint As Point = thePoint
                mypoint.X = value
                thePoint = mypoint
            End Set
        End Property
        Public Property Y() As Integer
            Get
                Return thePoint.X
            End Get
            Set(ByVal value As Integer)
                Dim mypoint As Point = thePoint
                mypoint.Y = value
                thePoint = mypoint
            End Set
        End Property
        Public Shared Widening Operator CType(ByVal val As Point) As PointRef
            Return New onePoint(val)
        End Operator
        Public Shared Widening Operator CType(ByVal val As PointRef) As Point
            Return val.thePoint
        End Operator
        Private Class onePoint
            Inherits PointRef
    
            Dim myPoint As Point
    
            Sub New(ByVal pt As Point)
                myPoint = pt
            End Sub
    
            Public Overrides Property thePoint() As System.Drawing.Point
                Get
                    Return myPoint
                End Get
                Set(ByVal value As System.Drawing.Point)
                    myPoint = value
                End Set
            End Property
        End Class
    End Class
    
    
    Class pointHolder
        Dim myPoints As New Dictionary(Of Integer, Point)
        Private Class IndexedPointRef
            Inherits PointRef
    
            Dim ref As pointHolder
            Dim index As Integer
            Sub New(ByVal ref As pointHolder, ByVal index As Integer)
                Me.ref = ref
                Me.index = index
            End Sub
            Public Overrides Property thePoint() As System.Drawing.Point
                Get
                    Dim mypoint As New Point(0, 0)
                    ref.myPoints.TryGetValue(index, mypoint)
                    Return mypoint
                End Get
                Set(ByVal value As System.Drawing.Point)
                    ref.myPoints(index) = value
                End Set
            End Property
        End Class
    
        Default Public Property item(ByVal index As Integer) As PointRef
            Get
                Return New IndexedPointRef(Me, index)
            End Get
            Set(ByVal value As PointRef)
                myPoints(index) = value.thePoint
            End Set
        End Property
    
        Shared Sub test()
            Dim theH1, theH2 As New pointHolder
            theH1(5).X = 9
            theH1(9).Y = 20
            theH2(12).X = theH1(9).Y
            theH1(20) = theH2(12)
            theH2(12).Y = 6
            Dim h5, h9, h12, h20 As Point
            h5 = theH1(5)
            h9 = theH1(9)
            h12 = theH2(12)
            h20 = theH1(20)
        End Sub
    End Class
    
    0 讨论(0)
  • 2021-02-04 17:19

    If you want to make Cell immutable - as you should if it is a struct - then a good technique is to make a factory that is an instance method on the Cell:

    struct C
    {
        public int Foo { get; private set; }
        public int Bar { get; private set; }
        private C (int foo, int bar) : this()
        {
            this.Foo = foo;
            this.Bar = bar;
        }
        public static C Empty = default(C);
        public C WithFoo(int foo)
        {
            return new C(foo, this.Bar);
        }
        public C WithBar(int bar)
        {
            return new C(this.Foo, bar);
        }
        public C IncrementFoo()
        {
            return new C(this.Foo + 1, bar);
        }
        // etc
    }
    ...
    C c = C.Empty;
    c = c.WithFoo(10);
    c = c.WithBar(20);
    c = c.IncrementFoo();
    // c is now 11, 20
    

    So your code would be something like

    map[x,y] = map[x,y].IncrementPopulation();
    

    However, I think this is possibly a blind alley; it might be better to simply not have so many Cells around in the first place, rather than trying to optimize a world where there are thousands of them. I'll write up another answer on that.

    0 讨论(0)
  • 2021-02-04 17:20

    So there are actually two problems here. There's the question you actually asked: what are techniques to deal with the fact that structs ought to be immutable because they are copied by value, but you want to mutate one. And then there's the question which is motivating this one, which is "how can I make the performance of my program acceptable?"

    My other answer addresses the first question, but the second question is interesting as well.

    First off, if the profiler has actually identified that the performance problem is due to garbage collection of cells, then it is possible that making cell into a struct will help. It is also possible that it will not help at all, and it is possible that doing so will make it worse.

    Your cells do not contain any reference types; we know this because you've said they are only three bytes. If someone else reading this is thinking that they could make a performance optimization by turning a class into a struct then it might not help at all because the class might contain a field of reference type, in which case the garbage collector still has to collect every instance, even if it is turned into a value type. The reference types in it need to be collected too! I would only recommend attempting this for performance reasons if Cell contains only value types, which apparently it does.

    It might make it worse because value types are not a panacea; they have costs too. Value types are often more expensive to copy than reference types (which are pretty much always the size of a register, almost always aligned on the appropriate memory boundary, and therefore the chip is highly optimized for copying them). And value types are copied all the time.

    Now, in your case you have a struct which is smaller than a reference; references are four or eight bytes typically. And you're putting them in an array, which means that you are packing the array down; if you have a thousand of them, it'll take three thousand bytes. Which means that three out of every four structs in there are misaligned, meaning more time (on many chip architectures) to get the value out of the array. You might consider measuring the impact of padding your struct out to four bytes to see if that makes a difference, provided you're still going to keep them in an array, which brings me to my next point...

    The Cell abstraction might simply be a bad abstraction for the purpose of storing data about lots of cells. If the problem is that Cells are classes, you're keeping an array of thousands of Cells, and collecting them is expensive, then there are solutions other than making Cell into a struct. Suppose for example that a Cell contains two bytes of Population and one byte of Color. That is the mechanism of Cell, but surely that is not the interface you want to expose to the users. There is no reason why your mechanism has to use the same type as the interface. And therefore you could manufacture instances of the Cell class on demand:

    interface ICell
    {
       public int Population { get; set; }
       public Color Color { get; set; }
    }
    private class CellMap
    {
        private ushort[,] populationData; // Profile the memory burden vs speed cost of ushort vs int
        private byte[,] colorData; // Same here. 
        public ICell this[int x, int y] 
        {
            get { return new Cell(this, x, y); }
        }
    
        private sealed class Cell : ICell
        {
            private CellMap map;
            private int x;
            private int y;
            public Cell(CellMap map, int x, int y)
            {
                this.map = map; // etc
            }
            public int Population  
            {
                get { return this.map.populationData[this.x, this.y]; } 
                set { this.map.populationData[this.x, this.y] = (ushort) value; } 
            }
    

    and so on. Manufacture the cells on demand. They will almost immediately be collected if they are short-lived. CellMap is an abstraction, so use the abstraction to hide the messy implementation details.

    With this architecture you don't have any garbage collection problems because you have almost no live Cell instances, but you can still say

    map[x,y].Population++;
    

    no problem, because the first indexer manufactures an immutable object which knows how to update the state of the map. The Cell doesn't need to be mutable; notice that the Cell class is completely immutable. (Heck, the Cell could be a struct here, though of course casting it to ICell would just box it anyway.) It is the map which is mutable, and the cell mutates the map for the user.

    0 讨论(0)
  • 2021-02-04 17:25

    6 . Use a ref parameter in a method that mutates the value, call it as IncrementCellPopulation(ref cellMap[x, y])

    0 讨论(0)
  • 2021-02-04 17:29

    Encapsulate what you want the CellMap to do, and allow access to the actual array only through appropriate methods like IncrementPopupation(int x, int y). Making an array (or any variable, for that matter) public in most cases is a serious code smell, as is returning an array in .NET.

    For performance reasons, consider using a single dimensional array; those are way faster in .NET.

    0 讨论(0)
  • 2021-02-04 17:31

    If your cell map is in fact "sparse", that is, if there are a lot of adjacent cells that have either no value or some default value, I would suggest that you do not create a cell object for these. Only create objects for cells that actually have some non-default state. (This might reduce the total number of cells by a significant amount, thereby taking pressure off the garbage collector.)

    This approach would of course require you to find a new way to store your cell map. You would have to get away from storing your cells in an array (since they are not sparse), and embrace a different kind of data structure, probably a tree.

    For example, you could subdivide your map into a number of uniform regions such that you can translate any cell coordinates into a corresponding region. (You could further subdivide each region into sub-regions according to the same idea.) You could then have a search tree per region where the cell coordinates act as key into the tree.

    Such a scheme would allow you to only store the cells you need, while still offering fast access to any cell in your map. If no cell at some specified coordinates is found in the trees, it can be assumed that it's a default cell.

    0 讨论(0)
提交回复
热议问题