Finding symmetric difference with LINQ

后端 未结 3 1054
心在旅途
心在旅途 2020-12-01 14:13

I have two collections a and b. I would like to compute the set of items in either a or b, but not in both (a logical exc

相关标签:
3条回答
  • 2020-12-01 14:42

    Use HashSet<T> directly - it has a SymmetricExceptWith method:

    HashSet<T> data = new HashSet<T>(a);
    data.SymmetricExceptWith(b);
    

    EDIT: If you want to maintain the order, here's an alternative:

    HashSet<T> data = new HashSet<T>(a);
    data.IntersectWith(b);
    foreach (T t in a.Concat(b))
    {
        if (!data.Contains(t))
        {
            yield return t;
        }
    }
    

    This has the following important differences:

    • Both a and b are iterated over twice. In some cases that could be a very bad thing - you could call ToList on each of them to start with to retain a buffer.
    • If there are duplicates in either a or b, they will be yielded multiple times. If you wanted to avoid this you could keep a set of already-yielded values. At this point, it would be equivalent to:

      a.Concat(b).Except(a.Intersect(b))
      

    That's still only two set operations instead of the three in your original code though.

    0 讨论(0)
  • 2020-12-01 14:49

    We had a similar need for a project in my company, so we wrote this extension:

    public class EnumerablePair<T> : IReadOnlyCollection<T>
    {
        private IReadOnlyCollection<T> _Left;
        private IReadOnlyCollection<T> _Right;
        private IEnumerable<T> _Union;
        private int _Count;
        public EnumerablePair(IEnumerable<T> left, IEnumerable<T> right)
        {
            _Left = left?.ToList() ?? Enumerable.Empty<T>().ToList();
            _Right = right?.ToList() ?? Enumerable.Empty<T>().ToList();
            _Count = Left.Count + Right.Count;
            _Union = Left.Union(Right);
        }
    
        public int Count => _Count;
        public IReadOnlyCollection<T> Left { get => _Left; }
        public IReadOnlyCollection<T> Right { get => _Right; }
    
        public IEnumerator<T> GetEnumerator()
        {
            return _Union.GetEnumerator();
        }
    
        IEnumerator IEnumerable.GetEnumerator()
        {
            return _Union.GetEnumerator();
        }
    }
    
    public static class EnumerableExtension
    {
        public static EnumerablePair<T> ExclusiveDisjunction<T>(this IEnumerable<T> leftOperand, IEnumerable<T> rightOperand, IEqualityComparer<T> comparer = null)
        {
            if (leftOperand == null)
                throw new ArgumentNullException(nameof(leftOperand), $"{nameof(leftOperand)} is null.");
            if (rightOperand == null)
                throw new ArgumentNullException(nameof(rightOperand), $"{nameof(rightOperand)} is null.");
    
            // TODO : Can be optimized if one of the IEnumerable parameters is empty.
    
            bool leftIsBigger = leftOperand.Count() > rightOperand.Count();
            var biggestOperand = leftIsBigger ? leftOperand.ToList() : rightOperand.ToList();
            var smallestOperand = leftIsBigger ? rightOperand.ToList() : leftOperand.ToList();
    
            var except1 = biggestOperand.ToList();
            var except2 = Enumerable.Empty<T>().ToList();
    
            Func<T, T, bool> areEquals;
            if (comparer != null)
                areEquals = (one, theOther) => comparer.Equals(one, theOther);
            else
                areEquals = (one, theOther) => one?.Equals(theOther) ?? theOther == null;
    
            foreach (T t in smallestOperand)
                if (except1.RemoveAll(item => areEquals(item, t)) == 0)
                    except2.Add(t);
    
            if (leftIsBigger)
                return new EnumerablePair<T>(except1, except2);
            return new EnumerablePair<T>(except2, except1);
        }
    }
    

    It compares elements of two collections (using an IEqualityComparer or not, at your choice).

    • The returned object, an EnumerablePair<T>, contains objects that are in leftOperand or rightOperand, but not both (XOR).
    • EnumerablePair<T>.Left contains objects that are in leftOperand but not in rightOperand.
    • EnumerablePair<T>.Right contains objects that are in rightOperand but not in leftOperand.

    You can use the extension like this :

    var xorList = list1.ExclusiveDisjunction(list2);
    var leftXor = xorList.Left;
    var rightXor = xorList.Right;
    

    xorList, leftXor and rightXor are IEnumerable<T>.

    0 讨论(0)
  • 2020-12-01 14:51

    Given a.Except(b) and b.Except(a) are disjoint, you can use concat instead of union, saving a set operator (and concat is more efficient).

    return a.Except (b).Concat (b.Except (a));
    

    This still runs through each list twice.

    0 讨论(0)
提交回复
热议问题