How to Left Outer Join two DataTables in c#?

后端 未结 3 1681
青春惊慌失措
青春惊慌失措 2020-12-09 09:46

How can I Left Outer Join (I think it is Left Outer Join but I am not 100% sure) two data tables with the following tables and conditions while keeping all columns from both

相关标签:
3条回答
  • 2020-12-09 10:02

    This is simply an inner join between the 2 tables:

    var query = (from x in a.AsEnumerable()
                  join y in b.AsEnumerable() on x.Field<int>("col1") equals y.Field<int>("col1")
                  select new { col1= y.Field<int>("col1"), col2=x.Field<int>("col2") }).ToList();
    

    Produces:

    col1 col2
    1    Hi 
    1    Hi 
    2    Bye 
    3    Later 
    3    Later 
    3    Later 
    
    0 讨论(0)
  • 2020-12-09 10:06

    You could probably use LINQ and do something like this:

    var dtblJoined = from dB in dtblB.AsEnumerable()
                     join dA in dtblA.AsEnumerable() on dA.col1 equals dB.col1 into dAB
                     from d in dAB.DefaultIfEmpty()
                     select new (col1 = dB.col1, ; col2 = (dB.col1 == dA.col1) ? dA.col2 : null);
    

    This would return an IEnumerable as the result not a DataTable, but it should get you closer to what you are looking for I think. May need a little tweaking though.

    0 讨论(0)
  • 2020-12-09 10:11

    Thanks all for your help. Here is what I came up with based on multiple resources:

    public static class DataTableHelper
    {
        public enum JoinType
        {
            /// <summary>
            /// Same as regular join. Inner join produces only the set of records that match in both Table A and Table B.
            /// </summary>
            Inner = 0,
            /// <summary>
            /// Same as Left Outer join. Left outer join produces a complete set of records from Table A, with the matching records (where available) in Table B. If there is no match, the right side will contain null.
            /// </summary>
            Left = 1
        }
    
        /// <summary>
        /// Joins the passed in DataTables on the colToJoinOn.
        /// <para>Returns an appropriate DataTable with zero rows if the colToJoinOn does not exist in both tables.</para>
        /// </summary>
        /// <param name="dtblLeft"></param>
        /// <param name="dtblRight"></param>
        /// <param name="colToJoinOn"></param>
        /// <param name="joinType"></param>
        /// <returns></returns>
        /// <remarks>
        /// <para>http://stackoverflow.com/questions/2379747/create-combined-datatable-from-two-datatables-joined-with-linq-c-sharp?rq=1</para>
        /// <para>http://msdn.microsoft.com/en-us/library/vstudio/bb397895.aspx</para>
        /// <para>http://www.codinghorror.com/blog/2007/10/a-visual-explanation-of-sql-joins.html</para>
        /// <para>http://stackoverflow.com/questions/406294/left-join-and-left-outer-join-in-sql-server</para>
        /// </remarks>
        public static DataTable JoinTwoDataTablesOnOneColumn(DataTable dtblLeft, DataTable dtblRight, string colToJoinOn, JoinType joinType)
        {
            //Change column name to a temp name so the LINQ for getting row data will work properly.
            string strTempColName = colToJoinOn + "_2";
            if (dtblRight.Columns.Contains(colToJoinOn))
                dtblRight.Columns[colToJoinOn].ColumnName = strTempColName;
    
            //Get columns from dtblLeft
            DataTable dtblResult = dtblLeft.Clone();
    
            //Get columns from dtblRight
            var dt2Columns = dtblRight.Columns.OfType<DataColumn>().Select(dc => new DataColumn(dc.ColumnName, dc.DataType, dc.Expression, dc.ColumnMapping));
    
            //Get columns from dtblRight that are not in dtblLeft
            var dt2FinalColumns = from dc in dt2Columns.AsEnumerable()
                                  where !dtblResult.Columns.Contains(dc.ColumnName)
                                  select dc;
    
            //Add the rest of the columns to dtblResult
            dtblResult.Columns.AddRange(dt2FinalColumns.ToArray());
    
            //No reason to continue if the colToJoinOn does not exist in both DataTables.
            if (!dtblLeft.Columns.Contains(colToJoinOn) || (!dtblRight.Columns.Contains(colToJoinOn) && !dtblRight.Columns.Contains(strTempColName)))
            {
                if (!dtblResult.Columns.Contains(colToJoinOn))
                    dtblResult.Columns.Add(colToJoinOn);
                return dtblResult;
            }
    
            switch (joinType)
            {
    
                default:
                case JoinType.Inner:
                    #region Inner
                    //get row data
                    //To use the DataTable.AsEnumerable() extension method you need to add a reference to the System.Data.DataSetExtension assembly in your project. 
                    var rowDataLeftInner = from rowLeft in dtblLeft.AsEnumerable()
                                           join rowRight in dtblRight.AsEnumerable() on rowLeft[colToJoinOn] equals rowRight[strTempColName]
                                           select rowLeft.ItemArray.Concat(rowRight.ItemArray).ToArray();
    
    
                    //Add row data to dtblResult
                    foreach (object[] values in rowDataLeftInner)
                        dtblResult.Rows.Add(values);
    
                    #endregion
                    break;
                case JoinType.Left:
                    #region Left
                    var rowDataLeftOuter = from rowLeft in dtblLeft.AsEnumerable()
                                           join rowRight in dtblRight.AsEnumerable() on rowLeft[colToJoinOn] equals rowRight[strTempColName] into gj
                                           from subRight in gj.DefaultIfEmpty()
                                           select rowLeft.ItemArray.Concat((subRight== null) ? (dtblRight.NewRow().ItemArray) :subRight.ItemArray).ToArray();
    
    
                    //Add row data to dtblResult
                    foreach (object[] values in rowDataLeftOuter)
                        dtblResult.Rows.Add(values);
    
                    #endregion
                    break;
            }
    
            //Change column name back to original
            dtblRight.Columns[strTempColName].ColumnName = colToJoinOn;
    
            //Remove extra column from result
            dtblResult.Columns.Remove(strTempColName);
    
            return dtblResult;
        }
    }
    

    EDIT 3:

    This method now works correctly and it is still fast when the tables have 2000+ rows. Any recommendations/suggestions/improvements would be appreciated.

    EDIT 4:

    I had a certain scenario that led me to realize the previous version was really doing an inner join. The function has been modified to fix that problem. I used info at this link to figure it out.

    0 讨论(0)
提交回复
热议问题