Why do BCL Collections use struct enumerators, not classes?

懵懂的女人 提交于 2019-11-26 18:28:48

Indeed, it is for performance reasons. The BCL team did a lot of research on this point before deciding to go with what you rightly call out as a suspicious and dangerous practice: the use of a mutable value type.

You ask why this doesn't cause boxing. It's because the C# compiler does not generate code to box stuff to IEnumerable or IEnumerator in a foreach loop if it can avoid it!

When we see

foreach(X x in c)

the first thing we do is check to see if c has a method called GetEnumerator. If it does, then we check to see whether the type it returns has method MoveNext and property current. If it does, then the foreach loop is generated entirely using direct calls to those methods and properties. Only if "the pattern" cannot be matched do we fall back to looking for the interfaces.

This has two desirable effects.

First, if the collection is, say, a collection of ints, but was written before generic types were invented, then it does not take the boxing penalty of boxing the value of Current to object and then unboxing it to int. If Current is a property that returns an int, we just use it.

Second, if the enumerator is a value type then it does not box the enumerator to IEnumerator.

Like I said, the BCL team did a lot of research on this and discovered that the vast majority of the time, the penalty of allocating and deallocating the enumerator was large enough that it was worth making it a value type, even though doing so can cause some crazy bugs.

For example, consider this:

struct MyHandle : IDisposable { ... }
...
using (MyHandle h = whatever)
{
    h = somethingElse;
}

You would quite rightly expect the attempt to mutate h to fail, and indeed it does. The compiler detects that you are trying to change the value of something that has a pending disposal, and that doing so might cause the object that needs to be disposed to actually not be disposed.

Now suppose you had:

struct MyHandle : IDisposable { ... }
...
using (MyHandle h = whatever)
{
    h.Mutate();
}

What happens here? You might reasonably expect that the compiler would do what it does if h were a readonly field: make a copy, and mutate the copy in order to ensure that the method does not throw away stuff in the value that needs to be disposed.

However, that conflicts with our intuition about what ought to happen here:

using (Enumerator enumtor = whatever)
{
    ...
    enumtor.MoveNext();
    ...
}

We expect that doing a MoveNext inside a using block will move the enumerator to the next one regardless of whether it is a struct or a ref type.

Unfortunately, the C# compiler today has a bug. If you are in this situation we choose which strategy to follow inconsistently. The behaviour today is:

  • if the value-typed variable being mutated via a method is a normal local then it is mutated normally

  • but if it is a hoisted local (because it's a closed-over variable of an anonymous function or in an iterator block) then the local is actually generated as a read-only field, and the gear that ensures that mutations happen on a copy takes over.

Unfortunately the spec provides little guidance on this matter. Clearly something is broken because we're doing it inconsistently, but what the right thing to do is not at all clear.

Struct methods are inlined when type of struct is known at compile time, and calling method via interface is slow, so answer is: because of performance reason.

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!