LINQ - Full Outer Join

后端 未结 16 1566
既然无缘
既然无缘 2020-11-21 22:45

I have a list of people\'s ID and their first name, and a list of people\'s ID and their surname. Some people don\'t have a first name and some don\'t have a surname; I\'d l

16条回答
  •  北荒
    北荒 (楼主)
    2020-11-21 23:22

    Update 1: providing a truly generalized extension method FullOuterJoin
    Update 2: optionally accepting a custom IEqualityComparer for the key type
    Update 3: this implementation has recently become part of MoreLinq - Thanks guys!

    Edit Added FullOuterGroupJoin (ideone). I reused the GetOuter<> implementation, making this a fraction less performant than it could be, but I'm aiming for 'highlevel' code, not bleeding-edge optimized, right now.

    See it live on http://ideone.com/O36nWc

    static void Main(string[] args)
    {
        var ax = new[] { 
            new { id = 1, name = "John" },
            new { id = 2, name = "Sue" } };
        var bx = new[] { 
            new { id = 1, surname = "Doe" },
            new { id = 3, surname = "Smith" } };
    
        ax.FullOuterJoin(bx, a => a.id, b => b.id, (a, b, id) => new {a, b})
            .ToList().ForEach(Console.WriteLine);
    }
    

    Prints the output:

    { a = { id = 1, name = John }, b = { id = 1, surname = Doe } }
    { a = { id = 2, name = Sue }, b =  }
    { a = , b = { id = 3, surname = Smith } }
    

    You could also supply defaults: http://ideone.com/kG4kqO

        ax.FullOuterJoin(
                bx, a => a.id, b => b.id, 
                (a, b, id) => new { a.name, b.surname },
                new { id = -1, name    = "(no firstname)" },
                new { id = -2, surname = "(no surname)" }
            )
    

    Printing:

    { name = John, surname = Doe }
    { name = Sue, surname = (no surname) }
    { name = (no firstname), surname = Smith }
    

    Explanation of terms used:

    Joining is a term borrowed from relational database design:

    • A join will repeat elements from a as many times as there are elements in b with corresponding key (i.e.: nothing if b were empty). Database lingo calls this inner (equi)join.
    • An outer join includes elements from a for which no corresponding element exists in b. (i.e.: even results if b were empty). This is usually referred to as left join.
    • A full outer join includes records from a as well as b if no corresponding element exists in the other. (i.e. even results if a were empty)

    Something not usually seen in RDBMS is a group join[1]:

    • A group join, does the same as described above, but instead of repeating elements from a for multiple corresponding b, it groups the records with corresponding keys. This is often more convenient when you wish to enumerate through 'joined' records, based on a common key.

    See also GroupJoin which contains some general background explanations as well.


    [1] (I believe Oracle and MSSQL have proprietary extensions for this)

    Full code

    A generalized 'drop-in' Extension class for this

    internal static class MyExtensions
    {
        internal static IEnumerable FullOuterGroupJoin(
            this IEnumerable a,
            IEnumerable b,
            Func selectKeyA, 
            Func selectKeyB,
            Func, IEnumerable, TKey, TResult> projection,
            IEqualityComparer cmp = null)
        {
            cmp = cmp?? EqualityComparer.Default;
            var alookup = a.ToLookup(selectKeyA, cmp);
            var blookup = b.ToLookup(selectKeyB, cmp);
    
            var keys = new HashSet(alookup.Select(p => p.Key), cmp);
            keys.UnionWith(blookup.Select(p => p.Key));
    
            var join = from key in keys
                       let xa = alookup[key]
                       let xb = blookup[key]
                       select projection(xa, xb, key);
    
            return join;
        }
    
        internal static IEnumerable FullOuterJoin(
            this IEnumerable a,
            IEnumerable b,
            Func selectKeyA, 
            Func selectKeyB,
            Func projection,
            TA defaultA = default(TA), 
            TB defaultB = default(TB),
            IEqualityComparer cmp = null)
        {
            cmp = cmp?? EqualityComparer.Default;
            var alookup = a.ToLookup(selectKeyA, cmp);
            var blookup = b.ToLookup(selectKeyB, cmp);
    
            var keys = new HashSet(alookup.Select(p => p.Key), cmp);
            keys.UnionWith(blookup.Select(p => p.Key));
    
            var join = from key in keys
                       from xa in alookup[key].DefaultIfEmpty(defaultA)
                       from xb in blookup[key].DefaultIfEmpty(defaultB)
                       select projection(xa, xb, key);
    
            return join;
        }
    }
    

提交回复
热议问题