Do you have any idea how to join two tables with an outer join? I know how to do this in SQL, but I need to do it in Excel now.
I have a list of all employees in one column I have a
The simple way (maybe the only way?) would be with intermediate cells.
In the "results" worksheet:
A, B, C, D, E
fernando, =VLOOKUP(...), =VLOOKUP(...), =IF(ISNA(B2),"<default-1>",B2), =IF(ISNA(C2),"<default-2>",C2)
Then hide columns B and C.
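For readers who think more easily in code than in cell formulas, here is a minimal Python sketch of the same lookup-with-default idea. It is not Excel code. The `tasks` dictionary, the `lookup_with_default` helper, and the default values are all hypothetical stand-ins for the second worksheet and the IF(ISNA(...)) columns.

```python
# Hypothetical lookup table standing in for the second worksheet
# (name -> (task, hours)); the values are illustrative only.
tasks = {
    "fernando": ("task A", 5),
    "vivian": ("task B", 8),
}

# Defaults that play the role of "<default-1>" and "<default-2>".
DEFAULT_TASK = "<default-1>"
DEFAULT_HOURS = "<default-2>"


def lookup_with_default(name):
    """Return (name, task, hours), using the defaults when the name has
    no match, much like IF(ISNA(VLOOKUP(...)), default, value)."""
    task, hours = tasks.get(name, (DEFAULT_TASK, DEFAULT_HOURS))
    return (name, task, hours)


employees = ["fernando", "hector", "vivian", "ivan"]
results = [lookup_with_default(name) for name in employees]

for row in results:
    print(row)
```

Every employee gets exactly one output row, which is why this approach behaves like a left join rather than a full outer join.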
Edit:
Actually, there is something close: a PivotTable. You can organize the data in such a way that cells with no matching data stay empty.
But that is a different kind of solution from formulas; whether it fits depends on your usage.
This method uses copying and pasting, filtering, and sorting to accomplish an outer join in Excel, and it is good for one-offs. The idea is to use VLOOKUP to find all matching records in both directions: Table 1 keys in Table 2, and Table 2 keys in Table 1. [I added another record to Table 2 to show the outer join.]
Table 1
Fernando
Hector
Vivian
Ivan
Table 2
Fernando, task A, 5 hours
Vivian, task B, 8 hours
Thomas, task B, 8 hours
Copy both tables into one table, with Table 1 in the leftmost columns and top rows and Table 2 in the columns to its right and the rows below it (the headers for both tables should be in row 1). In the next two columns, add VLOOKUP formulas that look for each key in the other table: Table 1 keys in Table 2, and Table 2 keys in Table 1.
Table 3
Name     | Name     | Task   | Hours   | Match 1                  | Match 2
Fernando |          |        |         | =VLOOKUP(A2,B:B,1,FALSE) | =VLOOKUP(B2,A:A,1,FALSE)
Hector   |          |        |         | =VLOOKUP(A3,B:B,1,FALSE) | =VLOOKUP(B3,A:A,1,FALSE)
Vivian   |          |        |         | =VLOOKUP(A4,B:B,1,FALSE) | =VLOOKUP(B4,A:A,1,FALSE)
Ivan     |          |        |         | =VLOOKUP(A5,B:B,1,FALSE) | =VLOOKUP(B5,A:A,1,FALSE)
         | Fernando | task A | 5 hours | =VLOOKUP(A6,B:B,1,FALSE) | =VLOOKUP(B6,A:A,1,FALSE)
         | Vivian   | task B | 8 hours | =VLOOKUP(A7,B:B,1,FALSE) | =VLOOKUP(B7,A:A,1,FALSE)
         | Thomas   | task B | 8 hours | =VLOOKUP(A8,B:B,1,FALSE) | =VLOOKUP(B8,A:A,1,FALSE)
Table 3 Result
Name     | Name     | Task   | Hours   | Match 1  | Match 2
Fernando |          |        |         | Fernando | N/A
Hector   |          |        |         | N/A      | N/A
Vivian   |          |        |         | Vivian   | N/A
Ivan     |          |        |         | N/A      | N/A
         | Fernando | task A | 5 hours | N/A      | Fernando
         | Vivian   | task B | 8 hours | N/A      | Vivian
         | Thomas   | task B | 8 hours | N/A      | N/A
NOTE: For large data sets, the next step can take a very long time because the VLOOKUP formulas recalculate. To avoid that, copy the Match 1 and Match 2 columns and paste them over themselves as values, so that VLOOKUP does not recalculate during filtering.
Filter on Match 1 and Match 2 to show only the rows where both are N/A. Copy the main data, with headers, to another sheet.
Name     | Name     | Task   | Hours   | Match 1 | Match 2
Hector   |          |        |         | N/A     | N/A
Ivan     |          |        |         | N/A     | N/A
         | Thomas   | task B | 8 hours | N/A     | N/A
Next, filter on Match 1 and Match 2 to show the remaining rows, the ones that found a match. Sort both tables by key so that the rows line up when you copy and paste. First copy the Table 1 data into the new sheet, below the data you pasted previously. Then copy the Table 2 data and paste it to the right of the Table 1 data you just pasted.
Name     | Name     | Task   | Hours   | Match 1  | Match 2
Fernando |          |        |         | Fernando | N/A
Vivian   |          |        |         | Vivian   | N/A
         | Fernando | task A | 5 hours | N/A      | Fernando
         | Vivian   | task B | 8 hours | N/A      | Vivian
The result is below; you can delete, sort, or otherwise rework the outer-joined data as needed.
Name     | Name     | Task   | Hours
Hector   |          |        |
Ivan     |          |        |
         | Thomas   | task B | 8 hours
Fernando | Fernando | task A | 5 hours
Vivian   | Vivian   | task B | 8 hours
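As a cross-check on what the manual procedure should produce, here is a minimal Python sketch of the same full outer join, using the sample rows from the result table. It is not part of the Excel method. The `full_outer_join` helper is a hypothetical name, and the sketch assumes names are unique within each table, as they are in the sample data.

```python
# Sample rows from the thread: Table 1 holds names only,
# Table 2 holds (name, task, hours).
table1 = ["Fernando", "Hector", "Vivian", "Ivan"]
table2 = [
    ("Fernando", "task A", "5 hours"),
    ("Vivian", "task B", "8 hours"),
    ("Thomas", "task B", "8 hours"),
]


def full_outer_join(left_names, right_rows):
    """Return rows of (left name, right name, task, hours).

    None marks a blank cell. Assumes names are unique within each table.
    """
    right_by_name = {row[0]: row for row in right_rows}
    left_set = set(left_names)

    # Rows that exist only in Table 1 (both "Match" columns were N/A).
    unmatched_left = [
        (name, None, None, None)
        for name in left_names
        if name not in right_by_name
    ]
    # Rows that exist only in Table 2.
    unmatched_right = [
        (None,) + row for row in right_rows if row[0] not in left_set
    ]
    # Rows whose key appears in both tables.
    matched = [
        (name,) + right_by_name[name]
        for name in left_names
        if name in right_by_name
    ]
    # Same order as the manual procedure: unmatched rows first, then matches.
    return unmatched_left + unmatched_right + matched


joined = full_outer_join(table1, table2)
for row in joined:
    print(row)
```

The three list comprehensions correspond to the three copy-and-paste passes: rows only in Table 1, rows only in Table 2, and rows found in both.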
There is indeed such a thing as a left join in Excel if you use ADO.
Go to the VBA editor (Alt-F11) and add a reference (Tools > References) to "Microsoft ActiveX Data Objects 2.8 Library". Create a new normal module (Insert > Module) and add this code:
Option Explicit

Sub get_employees()

    Dim cn As ADODB.Connection
    Set cn = New ADODB.Connection

    ' This is the Excel 97-2003 connection string. It should also work with
    ' Excel 2007 onwards worksheets as long as they have less than 65536
    ' rows
    'With cn
    '    .Provider = "Microsoft.Jet.OLEDB.4.0"
    '    .ConnectionString = "Data Source=" & ThisWorkbook.FullName & ";" & _
    '        "Extended Properties=Excel 8.0;"
    '    .Open
    'End With

    With cn
        .Provider = "Microsoft.ACE.OLEDB.12.0"
        .ConnectionString = "Data Source=" & ThisWorkbook.FullName & ";" & _
            "Extended Properties=""Excel 12.0 Macro;IMEX=1;HDR=YES"";"
        .Open
    End With

    Dim rs As ADODB.Recordset
    Set rs = New ADODB.Recordset
    rs.Open "SELECT * FROM [Sheet1$] LEFT JOIN [Sheet2$] ON [Sheet1$].[EMPLOYEE] = " & _
        "[Sheet2$].[EMPLOYEE]", cn

    Dim fld As ADODB.Field
    Dim i As Integer

    With ThisWorkbook.Worksheets("Sheet3")
        .UsedRange.ClearContents
        ' Write the field names as a header row
        i = 0
        For Each fld In rs.Fields
            i = i + 1
            .Cells(1, i).Value = fld.Name
        Next fld
        ' Write the joined rows below the header
        .Cells(2, 1).CopyFromRecordset rs
        .UsedRange.Columns.AutoFit
    End With

    rs.Close
    cn.Close

End Sub
Save the workbook, then run the code, and you should get a left-joined list on Sheet3. You'll see that the EMPLOYEE column is duplicated, but you can sort that out by amending the SELECT clause to name the columns you want. You'll also have blank cells rather than 0 hours where there is no match.
Edit: I've left the details of the Excel 97-2003 connection string in the code comments but have changed the code to use the Excel 2007 onwards connection string instead. I've also added code to output the field names and autofit the columns after the recordset has been output.
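To illustrate the two follow-up points (naming columns explicitly to avoid the duplicated key, and turning missing hours into 0), here is a minimal sketch using Python's built-in sqlite3 module instead of ADO. The table names `sheet1` and `sheet2` and the TASK and HOURS columns are assumptions made for the example. COALESCE is SQLite syntax; the ACE provider speaks a different SQL dialect, so check its own null-handling expression before carrying this over to the VBA query.

```python
import sqlite3

# In-memory database standing in for the workbook; the table and
# column names are assumptions, not taken from the original workbook.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sheet1 (employee TEXT)")
conn.execute("CREATE TABLE sheet2 (employee TEXT, task TEXT, hours INTEGER)")
conn.executemany(
    "INSERT INTO sheet1 VALUES (?)",
    [("Fernando",), ("Hector",), ("Vivian",), ("Ivan",)],
)
conn.executemany(
    "INSERT INTO sheet2 VALUES (?, ?, ?)",
    [("Fernando", "task A", 5), ("Vivian", "task B", 8), ("Thomas", "task B", 8)],
)

# Name the columns explicitly so the key appears only once, and
# replace NULL hours with 0 for employees that have no task.
rows = conn.execute(
    """
    SELECT s1.employee, s2.task, COALESCE(s2.hours, 0) AS hours
    FROM sheet1 AS s1
    LEFT JOIN sheet2 AS s2 ON s1.employee = s2.employee
    ORDER BY s1.employee
    """
).fetchall()
conn.close()

for row in rows:
    print(row)
```

Because this is a left join, Thomas (present only in the second table) does not appear in the output; a full outer join would need the unmatched right-hand rows added separately.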