I have a list of people\'s ID and their first name, and a list of people\'s ID and their surname. Some people don\'t have a first name and some don\'t have a surname; I\'d l
Update 1: providing a truly generalized extension method FullOuterJoin
Update 2: optionally accepting a custom IEqualityComparer
for the key type
Update 3: this implementation has recently become part of MoreLinq - Thanks guys!
Edit Added FullOuterGroupJoin
(ideone). I reused the GetOuter<>
implementation, making this a fraction less performant than it could be, but I'm aiming for 'highlevel' code, not bleeding-edge optimized, right now.
See it live on http://ideone.com/O36nWc
static void Main(string[] args)
{
var ax = new[] {
new { id = 1, name = "John" },
new { id = 2, name = "Sue" } };
var bx = new[] {
new { id = 1, surname = "Doe" },
new { id = 3, surname = "Smith" } };
ax.FullOuterJoin(bx, a => a.id, b => b.id, (a, b, id) => new {a, b})
.ToList().ForEach(Console.WriteLine);
}
Prints the output:
{ a = { id = 1, name = John }, b = { id = 1, surname = Doe } }
{ a = { id = 2, name = Sue }, b = }
{ a = , b = { id = 3, surname = Smith } }
You could also supply defaults: http://ideone.com/kG4kqO
ax.FullOuterJoin(
bx, a => a.id, b => b.id,
(a, b, id) => new { a.name, b.surname },
new { id = -1, name = "(no firstname)" },
new { id = -2, surname = "(no surname)" }
)
Printing:
{ name = John, surname = Doe }
{ name = Sue, surname = (no surname) }
{ name = (no firstname), surname = Smith }
Joining is a term borrowed from relational database design:
a
as many times as there are elements in b
with corresponding key (i.e.: nothing if b
were empty). Database lingo calls this inner (equi)join
.a
for which no corresponding
element exists in b
. (i.e.: even results if b
were empty). This is usually referred to as left join
.a
as well as b
if no corresponding element exists in the other. (i.e. even results if a
were empty)Something not usually seen in RDBMS is a group join[1]:
a
for multiple corresponding b
, it groups the records with corresponding keys. This is often more convenient when you wish to enumerate through 'joined' records, based on a common key.See also GroupJoin which contains some general background explanations as well.
[1] (I believe Oracle and MSSQL have proprietary extensions for this)
A generalized 'drop-in' Extension class for this
internal static class MyExtensions
{
internal static IEnumerable FullOuterGroupJoin(
this IEnumerable a,
IEnumerable b,
Func selectKeyA,
Func selectKeyB,
Func, IEnumerable, TKey, TResult> projection,
IEqualityComparer cmp = null)
{
cmp = cmp?? EqualityComparer.Default;
var alookup = a.ToLookup(selectKeyA, cmp);
var blookup = b.ToLookup(selectKeyB, cmp);
var keys = new HashSet(alookup.Select(p => p.Key), cmp);
keys.UnionWith(blookup.Select(p => p.Key));
var join = from key in keys
let xa = alookup[key]
let xb = blookup[key]
select projection(xa, xb, key);
return join;
}
internal static IEnumerable FullOuterJoin(
this IEnumerable a,
IEnumerable b,
Func selectKeyA,
Func selectKeyB,
Func projection,
TA defaultA = default(TA),
TB defaultB = default(TB),
IEqualityComparer cmp = null)
{
cmp = cmp?? EqualityComparer.Default;
var alookup = a.ToLookup(selectKeyA, cmp);
var blookup = b.ToLookup(selectKeyB, cmp);
var keys = new HashSet(alookup.Select(p => p.Key), cmp);
keys.UnionWith(blookup.Select(p => p.Key));
var join = from key in keys
from xa in alookup[key].DefaultIfEmpty(defaultA)
from xb in blookup[key].DefaultIfEmpty(defaultB)
select projection(xa, xb, key);
return join;
}
}