Combine two large tables into one table based on unique ID

后端 未结 4 1002
心在旅途
心在旅途 2020-12-22 06:23

To start, I have little knowledge of VBA and have not tried to write a code for what I want to do as I don\'t even know where to start.

I currently have two tables.

相关标签:
4条回答
  • 2020-12-22 07:02

    If you want a VBA solution that doesn't use pivot tables, you can try to create a dictionary object and use the ID as a key and the cash value as the value. like this. You need to add a reference to Microsoft Scripting Runtime first.

    Sub CreateEmployeeSum()
        Dim wb As Workbook
        Set wb = ThisWorkbook
        Dim table1 As Worksheet, _
            table2 As Worksheet, finalTable As Worksheet
        'wasn't sure if you were using sheets of data
        'or actual tables - if they are actual tables,
        'you can loop through those in a similar way, look up
        'on other stackoverflow problems how
    
    
        Set table1 = wb.Sheets("Sheet1") 'first sheet of info
        Set table2 = wb.Sheets("Sheet2") 'second sheet of info
        Set finalTable = wb.Sheets("Sheet3") 'destination sheet
    
    
        'get the last row of both tables
        Dim lastRowT1 As Long, lastRowT2 As Long
        lastRowT1 = table1.Cells.Find("*", SearchOrder:=xlByRows, SearchDirection:=xlPrevious).Row
        lastRowT2 = table2.Cells.Find("*", SearchOrder:=xlByRows, SearchDirection:=xlPrevious).Row
    
        'write the info to arrays so faster to loop through
        Dim t1Array As Variant, t2Array As Variant
        t1Array = table1.Range("A1:B" & lastRowT2).Value
        t2Array = table2.Range("A1:B" & lastRowT2).Value
    
        'create a dictionary that maps IDs to cash value
        Dim idToCashDict As Dictionary
        Set idToCashDict = New Dictionary
    
        'first loop through info from first sheet
        Dim i As Long
        For i = 1 To UBound(t1Array)
            Dim idNum As String, cashVal As Double
            idNum = CStr(t1Array(i, 1))
            cashVal = CDbl(t1Array(i, 2))
            If idToCashDict.Exists(idNum) Then
                cashVal = cashVal + idToCashDict.Item(idNum)
                idToCashDict.Remove idNum
                idToCashDict.Add idNum, cashVal
            Else
                idToCashDict.Add idNum, cashVal
            End If
    
        Next i
    
        'then through second sheet, adding to cash value of
        'ids that have been seen before
        For i = 1 To UBound(t2Array)
            Dim idNum2 As String, cashVal2 As Double
            idNum2 = CStr(t2Array(i, 1))
            cashVal2 = CDbl(t2Array(i, 2))
            If idToCashDict.Exists(idNum2) Then
                cashVal2 = cashVal2 + idToCashDict.Item(idNum2)
                idToCashDict.Remove idNum2
                idToCashDict.Add idNum2, cashVal2
            Else
                idToCashDict.Add idNum2, cashVal2
            End If
    
        Next i
    
    
        'then write the entries from the dictionary to the
        'destination sheet
        Dim finalVal As Double, finalID As String
        i = 1
        For Each finalID In idToCashDict.Keys
            finalVal = idToCashDict.Item(finalID)
            finalTable.Range("A" & i).Value = finalID
            finalTable.Range("B" & i).Value = finalVal
            i = i + 1
        Next finalID
    
    
    End Sub
    

    If you use actual tables, see answers such as here in order to loop through the rows in a similar way.

    0 讨论(0)
  • 2020-12-22 07:06

    In the end, I used PivotTable Wizard to combine the ranges in batches of 10,000.

    Thank you for your help.

    0 讨论(0)
  • 2020-12-22 07:06

    Here's an attempt at getting a sorted and combined table. The general strategy I've employed here is: make copies of existing tables and use them to add values, remove repetitive values, and do the same for the third combined table on sheet 3. Attach the following code to a command button.

    Application.ScreenUpdating = False
    Dim i As Long, x As Long, n As Long, j As Long
    Dim cashtotal As Integer
    
    lastrow1 = Sheet1.Range("A1048575").End(xlUp).Row
    astrow2 = Sheet2.Range("A1048575").End(xlUp).Row
    cashtotal = 0
    x = 1
    
    '''''Routine to make a copy of the existing data.
    For i = 1 To lastrow1
        Sheet1.Cells(i, 4) = Sheet1.Cells(i, 1)
        Sheet1.Cells(i, 5) = Sheet1.Cells(i, 2)
    Next
    
    '''''On Sheet1- Routine to remove repetitive values
    For i = 2 To lastrow1
        If Sheet1.Cells(i, 4) = "" Then GoTo 10
          x = x + 1
          cashtotal = Sheet1.Cells(i, 5)
          Sheet1.Cells(x, 7) = Sheet1.Cells(i, 4)
          Sheet1.Cells(x, 8) = Sheet1.Cells(i, 5)
    
            For j = i + 1 To lastrow1
               If Sheet1.Cells(j, 4) = Sheet1.Cells(i, 4) Then
                 cashtotal = cashtotal + Sheet1.Cells(j, 5)
                 Sheet1.Cells(x, 8) = cashtotal
                 Sheet1.Cells(j, 4).ClearContents
                 Sheet1.Cells(j, 5).ClearContents
               End If
            Next
    10
    Next
    x = 1
    
    '''''On Sheet2 the following routine makes a copy of the existing data
    For i = 1 To lastrow2
        Sheet2.Cells(i, 4) = Sheet2.Cells(i, 1)
        Sheet2.Cells(i, 5) = Sheet2.Cells(i, 2)
    Next
    
    '''''On sheet2 -  Routine to remove repetitive values
    For i = 2 To lastrow2
        If Sheet2.Cells(i, 4) = "" Then GoTo 20
           x = x + 1
           cashtotal = Sheet2.Cells(i, 5)
           Sheet2.Cells(x, 7) = Sheet2.Cells(i, 4)
           Sheet2.Cells(x, 8) = Sheet2.Cells(i, 5)
              For j = i + 1 To lastrow2
                If Sheet2.Cells(j, 4) = Sheet2.Cells(i, 4) Then
                  cashtotal = cashtotal + Sheet2.Cells(j, 5)
                  Sheet2.Cells(x, 8) = cashtotal
                  Sheet2.Cells(j, 4).ClearContents
                  Sheet2.Cells(j, 5).ClearContents
                End If
              Next
    20
    Next
    x = 1
    
    '''Transfer modified tables on sheet1 and sheet2 to sheet3 in a combined table
    lastrow4 = Sheet1.Range("G1048575").End(xlUp).Row
    
    For i = 1 To lastrow4
        Sheet3.Cells(i, 1) = Sheet1.Cells(i, 7)
        Sheet3.Cells(i, 2) = Sheet1.Cells(i, 8)
    Next
    
    lastrow5 = Sheet2.Range("G1048575").End(xlUp).Row
    lastrow6 = Sheet3.Range("A1048575").End(xlUp).Row
    
    For i = 2 To lastrow5
        Sheet3.Cells(lastrow6 + i - 1, 1) = Sheet2.Cells(i, 7)
        Sheet3.Cells(lastrow6 + i - 1, 2) = Sheet2.Cells(i, 8)
    Next
    
    '''''''Routine to make a copy of the existing table
    lastrow7 = Sheet3.Range("A1048575").End(xlUp).Row
    
    For i = 1 To lastrow7
        Sheet3.Cells(i, 4) = Sheet3.Cells(i, 1)
        Sheet3.Cells(i, 5) = Sheet3.Cells(i, 2)
    Next
    
    '''''''' Routine to remove repetitive values
    For i = 2 To lastrow7
        If Sheet3.Cells(i, 4) = "" Then GoTo 30
          x = x + 1
          cashtotal = Sheet3.Cells(i, 5)
          Sheet3.Cells(x, 7) = Sheet3.Cells(i, 4)
          Sheet3.Cells(x, 8) = Sheet3.Cells(i, 5)
             For j = i + 1 To lastrow7
                If Sheet3.Cells(j, 4) = Sheet3.Cells(i, 4) Then
                   cashtotal = cashtotal + Sheet3.Cells(j, 5)
                   Sheet3.Cells(x, 8) = cashtotal
    
                   Sheet3.Cells(j, 4).ClearContents
                   Sheet3.Cells(j, 5).ClearContents
                End If
            Next
    30
    Next
    Application.ScreenUpdating = True
    
    0 讨论(0)
  • 2020-12-22 07:16

    I would suggest connecting to the worksheets via an ADO connection and joining the two tables with an SQL statement.

    Add a reference to the Microsoft ActiveX Data Objects library (Tools -> References...) — use the latest version which is usually 6.1.

    Insert a module into the VBA project and paste the following code:

    Sub JoinTables()
    
    Dim connectionString As String
    connectionString = _
        "Provider=Microsoft.ACE.OLEDB.12.0;" & _
        "Data Source=""" & ActiveWorkbook.FullName & """;" & _
        "Extended Properties=""Excel 12.0;HDR=Yes"""
    
    'The SQL statement that shapes the resulting data
    Dim sql As String
    sql = _
        "SELECT t1.ID, t1.Value + IIF(t2.Value IS NULL, 0, t2.Value) AS FinalSum " & _
        "FROM [Sheet1$] AS t1 " & _
        "LEFT JOIN [Sheet2$] AS t2 ON t1.ID = t2.ID " & _
        "UNION SELECT t2.ID, t2.Value " & _
        "FROM [Sheet2$] AS t2 " & _
        "LEFT JOIN [Sheet1$] AS t1 ON t2.ID = t1.ID " & _
        "WHERE t1.ID IS NULL"
    
    Dim rs As New ADODB.Recordset
    'All the fun happens here
    rs.Open sql, connectionString
    
    'Paste the resulting records into the third sheet of the active workbook
    ActiveWorkbook.Sheets(3).Range("A2").CopyFromRecordset rs
    
    Set rs = Nothing
    
    End Sub
    

    Notes:

    • Currently, the recordset is reading data from the current (Excel) workbook. If the data is coming from a database, it may be simpler and more efficient to modify the connection string to connect to the database directly and issue the SQL statement against the database.
    • The code assumes that the first line of each worksheet holds the column labels, e.g. ID and Value. If this is not the case, specify HDR=No in the third line of the connectionString (instead of HDR=Yes), and the fields will be autoassigned names beginning from F1, F2, etc.
    • The results are pasted into the third sheet of the active workbook. This may or may not be appropriate.
    • You don't specify how you want to order the data, but that is simple enough with the addition of an ORDER BY clause to the SQL statement.

    Explanation of the SQL statement

    We are comparing two tables. For a given ID, there could be three possibilities:
    1. the ID appears in both tables,
    2. it appears only in the first table, or
    3. it appears only in the second table.

    We are also working with the assumption that the ID is unique within each table.

    The first half of the statement (up to UNION) handles 1 and 2.

    SELECT t1.ID, t1.Value + IIF(t2.Value IS NULL, 0, t2.Value) AS FinalSum 
    FROM [Sheet1$] AS t1
    LEFT JOIN [Sheet2$] AS t2 ON t1.ID = t2.ID
    

    It can be described as follows:

    Start with the records in the first table — FROM [Sheet1$] AS t1

    Match up each record in the second table to the corresponding record in the first table, based on ID — LEFT JOIN [Sheet2$] AS t2 ON t1.ID = t2.ID

    Include all records from the first table, and only matching records in the second table — the LEFT in LEFT JOIN

    Return two columns: the ID from the first table, and the combination of the values from the first and second table — SELECT ...

    If there is no matching record in the second table, the value will be NULL (not the same as zero). Trying to add a number to NULL will return NULL, which is not what we want. So we have to write this formula — t1.Value + IIF(t2.Value IS NULL, 0, t2.Value):

    • If the value from the second table is null then add 0

    • otherwise add the value from the second table

    The second half of the statement handles IDs that appear only in the second table:

    UNION 
    SELECT t2.ID, t2.Value
    FROM [Sheet2$] AS t2
    LEFT JOIN [Sheet1$] AS t1 ON t2.ID = t1.ID
    WHERE t1.ID IS NULL
    

    Append a second set of results on top of the first set of results — UNION

    Start with the records from the second table — FROM [Sheet2$] AS t2

    Match up the records from the first table to the records in the second table (note this is reversed from the first half of the query) — LEFT JOIN [Sheet1$] AS t1 ON t2.ID = t1.ID

    We only want records that don't have an ID in the first table — WHERE t1.ID IS NULL

    0 讨论(0)
提交回复
热议问题