Grouping consecutive identical items: IEnumerable<T> to IEnumerable<IEnumerable<T>>

说谎 2021-02-09 09:38

I've got an interesting problem: given an IEnumerable<T>, is it possible to yield an IEnumerable<IEnumerable<T>> that groups consecutive identical items?

4 answers
  •  甜味超标
    2021-02-09 10:36

    Way Better Solution That Meets All Requirements

    OK, scrap my previous solution (I'll leave it below, just for reference). Here's a much better approach that occurred to me after making my initial post.

    Write a new class that implements IEnumerator<T> and provides a few additional properties: IsValid and Previous. This is all you really need to resolve the whole mess with having to maintain state inside an iterator block using yield.

    Here's how I did it (pretty trivial, as you can see):

    internal class ChipmunkEnumerator<T> : IEnumerator<T> {
    
        private readonly IEnumerator<T> _internal;
        private T _previous;
        private bool _isValid;
    
        public ChipmunkEnumerator(IEnumerator<T> e) {
            _internal = e;
            _isValid = false;
        }
    
        public bool IsValid {
            get { return _isValid; }
        }
    
        public T Previous {
            get { return _previous; }
        }
    
        public T Current {
            get { return _internal.Current; }
        }
    
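        public bool MoveNext() {
            // Remember the element we are about to leave, but only once the
            // inner enumerator has actually been positioned on one.
            if (_isValid)
                _previous = _internal.Current;
    
            return (_isValid = _internal.MoveNext());
        }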
    
        public void Dispose() {
            _internal.Dispose();
        }
    
        #region Explicit Interface Members
    
        object System.Collections.IEnumerator.Current {
            get { return Current; }
        }
    
        void System.Collections.IEnumerator.Reset() {
            _internal.Reset();
            _previous = default(T);
            _isValid = false;
        }
    
        #endregion
    
    }
    

    (I called this a ChipmunkEnumerator because maintaining the previous value reminded me of how chipmunks have pouches in their cheeks where they keep nuts. Does it really matter? Stop making fun of me.)

    Now, utilizing this class in an extension method to provide exactly the behavior you want isn't so tough!

    Notice that below I've defined GroupConsecutive to actually return an IEnumerable<IGrouping<TKey, T>> for the simple reason that, if these are grouped by key anyway, it makes sense to return an IGrouping<TKey, T> rather than just an IEnumerable<T>. As it turns out, this will help us out later anyway...

    public static IEnumerable<IGrouping<TKey, T>> GroupConsecutive<T, TKey>(this IEnumerable<T> source, Func<T, TKey> keySelector)
        where TKey : IEquatable<TKey> {
    
        using (var e = new ChipmunkEnumerator<T>(source.GetEnumerator())) {
            if (!e.MoveNext())
                yield break;
    
            // Each group shares the enumerator e, so it must be consumed
            // before the next one is requested.
            while (e.IsValid) {
                yield return e.GetNextDuplicateGroup(keySelector);
            }
        }
    }
    
    public static IEnumerable<IGrouping<T, T>> GroupConsecutive<T>(this IEnumerable<T> source)
        where T : IEquatable<T> {
    
        return source.GroupConsecutive(x => x);
    }
    
    private static IGrouping<TKey, T> GetNextDuplicateGroup<T, TKey>(this ChipmunkEnumerator<T> e, Func<T, TKey> keySelector)
        where TKey : IEquatable<TKey> {
    
        return new Grouping<TKey, T>(keySelector(e.Current), e.EnumerateNextDuplicateGroup(keySelector));
    }
    
    private static IEnumerable<T> EnumerateNextDuplicateGroup<T, TKey>(this ChipmunkEnumerator<T> e, Func<T, TKey> keySelector)
        where TKey : IEquatable<TKey> {
    
        // Yield the current element, then keep going while the key is unchanged.
        do {
            yield return e.Current;
    
        } while (e.MoveNext() && keySelector(e.Previous).Equals(keySelector(e.Current)));
    }
    

    (To implement these methods, I wrote a simple Grouping class that implements IGrouping<TKey, T> in the most straightforward way possible. I've omitted the code just so as to keep moving along...)
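
    Since that code is omitted, here is a minimal sketch of what such a class might look like. It is an assumption, not the original implementation: only the name Grouping and the constructor shape (key first, then elements) are taken from the call in GetNextDuplicateGroup; everything else is a guess at "the most straightforward way possible".

```csharp
using System.Collections;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the omitted Grouping class.
internal class Grouping<TKey, T> : IGrouping<TKey, T> {

    // Stored as-is, so enumeration stays deferred.
    private readonly IEnumerable<T> _elements;

    public Grouping(TKey key, IEnumerable<T> elements) {
        Key = key;
        _elements = elements;
    }

    public TKey Key { get; private set; }

    public IEnumerator<T> GetEnumerator() {
        return _elements.GetEnumerator();
    }

    IEnumerator IEnumerable.GetEnumerator() {
        return GetEnumerator();
    }
}
```

    Note that this version only wraps the sequence it is given. It does not buffer anything, so the elements are still pulled lazily when the group is enumerated.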

    OK, check it out. I think the code example below pretty well captures something resembling the more realistic scenario you described in your updated question.

    var entries = new List<KeyValuePair<string, int>> {
        new KeyValuePair<string, int>( "Dan", 10 ),
        new KeyValuePair<string, int>( "Bill", 12 ),
        new KeyValuePair<string, int>( "Dan", 14 ),
        new KeyValuePair<string, int>( "Dan", 20 ),
        new KeyValuePair<string, int>( "John", 1 ),
        new KeyValuePair<string, int>( "John", 2 ),
        new KeyValuePair<string, int>( "Bill", 5 )
    };
    
    var dupeGroups = entries
        .GroupConsecutive(entry => entry.Key);
    
    foreach (var dupeGroup in dupeGroups) {
        Console.WriteLine(
            "Key: {0} Sum: {1}",
            dupeGroup.Key.PadRight(5),
            dupeGroup.Select(entry => entry.Value).Sum()
        );
    }
    

    Output:

    Key: Dan   Sum: 10
    Key: Bill  Sum: 12
    Key: Dan   Sum: 34
    Key: John  Sum: 3
    Key: Bill  Sum: 5
    

    Notice this also fixes the problem with my original answer of dealing with IEnumerator<T> objects that were value types. (With this approach, it doesn't matter.)

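    There's still going to be a problem if you try calling ToList on the outer sequence here. Each inner group is enumerated lazily against the shared enumerator (assuming Grouping just wraps that lazy sequence), so if nothing consumes a group before the next one is requested, the enumerator never advances and the outer loop keeps yielding groups for the same element. But considering you included deferred execution as a requirement, I doubt you would be doing that anyway. For a foreach that enumerates each group before moving on to the next, it works.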
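
    If you do need to materialize the results, one option is to give up deferred execution inside each group and buffer every run before yielding it. The sketch below is not the ChipmunkEnumerator approach. GroupConsecutiveBuffered is a hypothetical name, and the use of EqualityComparer<TKey>.Default and KeyValuePair<TKey, List<T>> (rather than IEquatable and IGrouping) is an assumption made to keep the example self-contained.

```csharp
using System;
using System.Collections.Generic;

public static class BufferedGroupingExtensions {

    // Hypothetical helper: the outer sequence is still deferred, but each
    // run is copied into a List<T> before it is yielded.
    public static IEnumerable<KeyValuePair<TKey, List<T>>> GroupConsecutiveBuffered<T, TKey>(
        this IEnumerable<T> source, Func<T, TKey> keySelector) {

        var comparer = EqualityComparer<TKey>.Default;

        using (var e = source.GetEnumerator()) {
            if (!e.MoveNext())
                yield break;

            var key = keySelector(e.Current);
            var run = new List<T> { e.Current };

            while (e.MoveNext()) {
                var nextKey = keySelector(e.Current);

                // Key changed: hand out the finished run and start a new one.
                if (!comparer.Equals(key, nextKey)) {
                    yield return new KeyValuePair<TKey, List<T>>(key, run);
                    key = nextKey;
                    run = new List<T>();
                }

                run.Add(e.Current);
            }

            // The last run has no following element to close it.
            yield return new KeyValuePair<TKey, List<T>>(key, run);
        }
    }
}
```

    Because every run is complete by the time it is yielded, calling ToList on the result is safe. The trade-off is that a very long run has to fit in memory.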


    Original, Messy, and Somewhat Stupid Solution

    Something tells me I'm going to get totally refuted for saying this, but...

    Yes, it is possible (I think). See below for a damn messy solution I threw together. (Catches an exception to know when it's finished, so you know it's a great design!)

    Now, Jon's point about there being a very real problem in the event that you try to do, for instance, ToList, and then access the values in the resulting list by index, is totally valid. But if your only intention here is to be able to loop over an IEnumerable using a foreach -- and you're only doing this in your own code -- then, well, I think this could work for you.

    Anyway, here's a quick example of how it works:

    var ints = new int[] { 1, 3, 3, 4, 4, 4, 5, 2, 3, 1, 6, 6, 6, 5, 7, 7, 8 };
    
    var dupeGroups = ints.GroupConsecutiveDuplicates(EqualityComparer<int>.Default);
    
    foreach (var dupeGroup in dupeGroups) {
        Console.WriteLine(
            "New dupe group: " +
            string.Join(", ", dupeGroup.Select(i => i.ToString()).ToArray())
        );
    }
    

    Output:

    New dupe group: 1
    New dupe group: 3, 3
    New dupe group: 4, 4, 4
    New dupe group: 5
    New dupe group: 2
    New dupe group: 3
    New dupe group: 1
    New dupe group: 6, 6, 6
    New dupe group: 5
    New dupe group: 7, 7
    New dupe group: 8
    

    And now for the (messy as crap) code:

    Note that since this approach requires passing the actual enumerator around between a few different methods, it will not work if that enumerator is a value type, as calls to MoveNext in one method are only affecting a local copy.
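
    The copy behavior is easiest to see in isolation. The sketch below uses List<int>.Enumerator, which is a struct, and EnumeratorCopyDemo and its methods are hypothetical names for illustration only. It also shows the contrasting case: when the same enumerator is held through the IEnumerator<int> interface it is boxed, and callers share that one boxed instance.

```csharp
using System.Collections.Generic;

public static class EnumeratorCopyDemo {

    // Receives the struct by value, so it advances a private copy.
    private static void AdvanceStruct(List<int>.Enumerator e) {
        e.MoveNext();
    }

    // Receives a reference, so it advances the caller's enumerator.
    private static void AdvanceInterface(IEnumerator<int> e) {
        e.MoveNext();
    }

    public static int CurrentAfterStructCall() {
        var list = new List<int> { 1, 2, 3 };
        List<int>.Enumerator e = list.GetEnumerator();
        e.MoveNext();        // positioned on 1
        AdvanceStruct(e);    // only the copy moves on to 2
        return e.Current;    // the caller's enumerator is still on 1
    }

    public static int CurrentAfterInterfaceCall() {
        IEnumerable<int> list = new List<int> { 1, 2, 3 };
        using (IEnumerator<int> e = list.GetEnumerator()) { // boxed struct
            e.MoveNext();        // positioned on 1
            AdvanceInterface(e); // same boxed instance moves on to 2
            return e.Current;
        }
    }
}
```

    CurrentAfterStructCall returns 1 and CurrentAfterInterfaceCall returns 2. So whether the copy problem bites depends on the static type of the variable being passed around, not only on whether the underlying enumerator is a value type.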

    public static IEnumerable<IEnumerable<T>> GroupConsecutiveDuplicates<T>(this IEnumerable<T> source, IEqualityComparer<T> comparer) {
        using (var e = source.GetEnumerator()) {
            if (e.GetType().IsValueType)
                throw new ArgumentException(
                    "This method will not work on a value type enumerator."
                );
    
            // get the ball rolling
            if (!e.MoveNext()) {
                yield break;
            }
    
            IEnumerable<T> nextDuplicateGroup;
    
            while (e.FindMoreDuplicates(comparer, out nextDuplicateGroup)) {
                yield return nextDuplicateGroup;
            }
        }
    }
    
    private static bool FindMoreDuplicates<T>(this IEnumerator<T> enumerator, IEqualityComparer<T> comparer, out IEnumerable<T> duplicates) {
        duplicates = enumerator.GetMoreDuplicates(comparer);
    
        return duplicates != null;
    }
    
    private static IEnumerable<T> GetMoreDuplicates<T>(this IEnumerator<T> enumerator, IEqualityComparer<T> comparer) {
        try {
            if (enumerator.Current != null)
                return enumerator.GetMoreDuplicatesInner(comparer);
            else
                return null;
    
        } catch (InvalidOperationException) {
            return null;
        }
    }
    
    private static IEnumerable<T> GetMoreDuplicatesInner<T>(this IEnumerator<T> enumerator, IEqualityComparer<T> comparer) {
        while (enumerator.Current != null) {
            var current = enumerator.Current;
            yield return current;
    
            if (!enumerator.MoveNext())
                break;
    
            if (!comparer.Equals(current, enumerator.Current))
                break;
        }
    }
    
