Why isn't Skip() in LINQ to objects optimized?

问题

var res = new int[1000000].Skip(999999).First();

It would be great if this query would just use the indexer instead of traversing 999999 entries.

I had a look into the System.Core.dll and noticed that in contrast to Skip(), the Count() extension method is optimized. If the IEnumerable implements ICollection then it just calls the Count property.

回答1:

If you look at my answer to a similar question, it appears as though it should be easy to provide a non-naive (i.e. throws proper exceptions) optimization of Skip for any IList:

    public static IEnumerable<T> Skip<T>(this IList<T> source, int count)
    {
        using (var e = source.GetEnumerator())
            while (count < source.Count && e.MoveNext())
                yield return source[count++];
    }

Of course, your example uses an array. Since arrays don't throw exceptions during iteration, even doing anything as complicated as my function would be unnecessary. One could thus conclude that MS didn't optimize it because they didn't think of it or they didn't think it was a common enough case to be worth optimizing.

回答2:

I'll let Jon Skeet answer this:

If our sequence is a list, we can just skip straight to the right part of it and yield the items one at a time. That sounds great, but what if the list changes (or is even truncated!) while we're iterating over it? An implementation working with the simple iterator would usually throw an exception, as the change would invalidate the iterator. This is definitely a behavioural change. When I first wrote about Skip, I included this as a "possible" optimization - and actually turned it on in the Edulinq source code. I now believe it to be a mistake, and have removed it completely.

...

The problem with both of these "optimizations" is arguably that they're applying list-based optimizations within an iterator block used for deferred execution. Optimizing for lists either upfront at the point of the initial method call or within an immediate execution operator (Count, ToList etc) is fine, because we assume the sequence won't change during the course of the method's execution. We can't make that assumption with an iterator block, because the flow of the code is very different: our code is visited repeatedly based on the caller's use of MoveNext().

https://msmvps.com/blogs/jon_skeet/archive/2011/01/26/reimplementing-linq-to-objects-part-40-optimization.aspx

来源：https://stackoverflow.com/questions/6245172/why-isnt-skip-in-linq-to-objects-optimized

标签

.net

linq

optimization

linq-to-objects