I have a question about Union and Concat. It seems that both behave the same way in the case of a List.
var a1 = (new[] { 1,
Union returns distinct values. By default it uses the default equality comparer, which for a class that does not override Equals and GetHashCode compares references. Your items are all different instances, so they are all considered different. Casting to the base type X does not change the reference.
If you override Equals and GetHashCode (both are used to select distinct items), the items will no longer be compared by reference:
class X
{
    public int ID { get; set; }

    public override bool Equals(object obj)
    {
        X x = obj as X;
        if (x == null)
            return false;
        return x.ID == ID;
    }

    public override int GetHashCode()
    {
        return ID.GetHashCode();
    }
}
But all your items have different ID values, so they are still considered different. If you provide several items with the same ID, you will see the difference between Union and Concat:
var lstX1 = new List<X1> { new X1 { ID = 1, ID1 = 10 },
                           new X1 { ID = 10, ID1 = 100 } };
var lstX2 = new List<X2> { new X2 { ID = 1, ID2 = 20 }, // ID changed here
                           new X2 { ID = 20, ID2 = 200 } };
var a5 = lstX1.Cast<X>().Union(lstX2.Cast<X>());  // 3 distinct items
var a6 = lstX1.Cast<X>().Concat(lstX2.Cast<X>()); // 4 items
Your initial sample works, because integers are value types and they are compared by value.
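To make the value-versus-reference point concrete, here is a minimal sketch. The arrays and variable names are illustrative and not taken from the question, and plain object instances stand in for any class that does not override Equals. It assumes C# 9 or later for top-level statements.

```csharp
using System;
using System.Linq;

// Value types: equal values count as duplicates.
int[] first = { 1, 2, 3 };
int[] second = { 3, 4, 5 };

int unionCount = first.Union(second).Count();   // the shared 3 is kept once
int concatCount = first.Concat(second).Count(); // the shared 3 appears twice

// Reference types without an Equals override: only the very same
// instance counts as a duplicate.
var shared = new object();
object[] left = { shared, new object() };
object[] right = { shared, new object() };

int refUnionCount = left.Union(right).Count();   // only 'shared' is deduplicated
int refConcatCount = left.Concat(right).Count();

Console.WriteLine($"ints:    Union={unionCount}, Concat={concatCount}");       // ints:    Union=5, Concat=6
Console.WriteLine($"objects: Union={refUnionCount}, Concat={refConcatCount}"); // objects: Union=3, Concat=4
```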
Union and Concat behave the same here because Union cannot detect duplicates unless X overrides Equals and GetHashCode or you pass a custom IEqualityComparer<X>. Without either, it only checks whether two items are the same reference.
public class XComparer : IEqualityComparer<X>
{
    public bool Equals(X x1, X x2)
    {
        if (object.ReferenceEquals(x1, x2))
            return true;
        if (x1 == null || x2 == null)
            return false;
        return x1.ID.Equals(x2.ID);
    }

    public int GetHashCode(X x)
    {
        return x.ID.GetHashCode();
    }
}
Now you can pass it to the overload of Union that takes a comparer:
var comparer = new XComparer();
a5 = lstX1.Cast<X>().Union(lstX2.Cast<X>(), comparer);
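Putting the pieces together, the following is a self-contained sketch of the comparer in use. The X, X1 and X2 declarations are assumptions reconstructed from the property names used in this thread (the original class definitions are not shown here), and X deliberately does not override Equals so that only the comparer affects the result. It assumes C# 9 or later for top-level statements.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var lstX1 = new List<X1> { new X1 { ID = 1, ID1 = 10 }, new X1 { ID = 10, ID1 = 100 } };
var lstX2 = new List<X2> { new X2 { ID = 1, ID2 = 20 }, new X2 { ID = 20, ID2 = 200 } };

// Default comparer: reference equality, so nothing is treated as a duplicate.
int withoutComparer = lstX1.Cast<X>().Union(lstX2.Cast<X>()).Count();

// Custom comparer: the two items with ID = 1 collapse into one.
int withComparer = lstX1.Cast<X>().Union(lstX2.Cast<X>(), new XComparer()).Count();

// Concat never removes anything.
int concatenated = lstX1.Cast<X>().Concat(lstX2.Cast<X>()).Count();

Console.WriteLine($"{withoutComparer} {withComparer} {concatenated}"); // 4 3 4

// Assumed hierarchy (hypothetical reconstruction).
class X { public int ID { get; set; } }
class X1 : X { public int ID1 { get; set; } }
class X2 : X { public int ID2 { get; set; } }

// Same comparer as above: two X instances are equal when their IDs match.
class XComparer : IEqualityComparer<X>
{
    public bool Equals(X x1, X x2)
    {
        if (ReferenceEquals(x1, x2)) return true;
        if (x1 == null || x2 == null) return false;
        return x1.ID == x2.ID;
    }

    public int GetHashCode(X x)
    {
        return x.ID.GetHashCode();
    }
}
```

A separate comparer is handy when you cannot or do not want to change how X defines equality everywhere else in the program.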
Concat literally returns the items from the first sequence followed by the items from the second sequence. If you use Concat on two 2-item sequences, you will always get a 4-item sequence.
Union is essentially Concat followed by Distinct.
In your first two cases, you end up with 2-item sequences because, between them, each pair of input sequences has exactly two distinct items.
In your third case, you end up with a 4-item sequence because all four items in your two input sequences are distinct.
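The "Concat followed by Distinct" description can be checked with a short sketch. The string arrays below are illustrative values, not the ones from the question, and the code assumes C# 9 or later for top-level statements.

```csharp
using System;
using System.Linq;

string[] a = { "x", "y" };
string[] b = { "y", "z" };

// Union keeps the first occurrence of each item, in order.
var viaUnion = a.Union(b).ToArray();

// The same result, spelled out as two steps.
var viaConcatDistinct = a.Concat(b).Distinct().ToArray();

bool same = viaUnion.SequenceEqual(viaConcatDistinct);

Console.WriteLine(string.Join(", ", viaUnion)); // x, y, z
Console.WriteLine(same);                        // True
```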