I am learning SQL and am trying to learn JOINs this week.
I have gotten to the level where I can do three table joins, similar to a lot of examples I've seen.
Selecting from three tables is no different from selecting from only two (or as many as a hundred, though that would be a fairly ugly query to read).
For EACH join you write, having INNER indicates that you only want rows that successfully join those two tables together. If other tables were joined earlier in the query, those results are now completely irrelevant, except to the extent your own join conditions call on them.
For example:
SELECT person.*
FROM person
LEFT JOIN vehicle ON (person.person_id = vehicle.owner_id)
LEFT JOIN house ON (person.person_id = house.owner_id)
Here I want a list of all people, and (if available) all the vehicles and houses they own.
Alternatively:
SELECT person.*
FROM person
INNER JOIN vehicle ON (person.person_id = vehicle.owner_id)
LEFT JOIN house ON (person.person_id = house.owner_id)
Here I want all people who own vehicles (they must own a vehicle to get results in my query), and (if available) all the houses they own.
Each join is completely separate here.
Of course, by varying what you put in the ON clause, you can make joins interrelate tables any way you want.
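To see the difference between the two queries above on real rows, here is a minimal sketch using Python's built-in sqlite3 module. The table layouts, the sample names, and the `run` helper are assumptions made up for illustration; only the join structure comes from the queries above, and the column list is widened so the joined columns are visible.

```python
import sqlite3

# Hypothetical sample data: Ann owns a vehicle and a house,
# Bob owns only a vehicle, Cid owns only a house.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person  (person_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE vehicle (vehicle_id INTEGER PRIMARY KEY, owner_id INTEGER, model TEXT);
    CREATE TABLE house   (house_id INTEGER PRIMARY KEY, owner_id INTEGER, city TEXT);
    INSERT INTO person  VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cid');
    INSERT INTO vehicle VALUES (1, 1, 'van'), (2, 2, 'bike');
    INSERT INTO house   VALUES (1, 1, 'Oslo'), (2, 3, 'Rome');
""")

def run(vehicle_join):
    """Run the query with either a LEFT or an INNER join to vehicle."""
    sql = f"""
        SELECT person.name, vehicle.model, house.city
        FROM person
        {vehicle_join} JOIN vehicle ON (person.person_id = vehicle.owner_id)
        LEFT JOIN house ON (person.person_id = house.owner_id)
        ORDER BY person.person_id
    """
    return conn.execute(sql).fetchall()

all_people = run("LEFT")       # everyone, with NULLs where nothing is owned
vehicle_owners = run("INNER")  # only people with at least one vehicle

print(all_people)
print(vehicle_owners)
```

With this data, the LEFT version keeps Cid (who has no vehicle), while the INNER version drops him even though he owns a house.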
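On some SQL engines there's an issue when you chain another join onto a left join. If you join A->B->C and the row in B doesn't exist, then the join column from B is NULL. A few I've used require that the join from B->C be a left join if the join from A->B is a left join. Even where the engine accepts an inner join there, the NULL in B's column can never match a row in C, so the rows that the left join preserved are dropped again.
This is OK: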
select a.*, b.*, c.*
from a
left join b on b.id = a.id
left join c on c.id = b.id
This is not:
select a.*, b.*, c.*
from a
left join b on b.id = a.id
inner join c on c.id = b.id
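As a rough check of what happens to the unmatched rows, the sketch below runs both queries in SQLite through Python's sqlite3 module. The single-column tables and their contents are assumptions for illustration, and SQLite is only one engine; others may behave differently or reject the second form, as described above.

```python
import sqlite3

# Hypothetical data: a has ids 1 and 2, but b and c only have id 1.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER);
    CREATE TABLE b (id INTEGER);
    CREATE TABLE c (id INTEGER);
    INSERT INTO a VALUES (1), (2);
    INSERT INTO b VALUES (1);
    INSERT INTO c VALUES (1);
""")

left_then_left = conn.execute("""
    select a.id, b.id, c.id
    from a
    left join b on b.id = a.id
    left join c on c.id = b.id
    order by a.id
""").fetchall()

left_then_inner = conn.execute("""
    select a.id, b.id, c.id
    from a
    left join b on b.id = a.id
    inner join c on c.id = b.id
    order by a.id
""").fetchall()

print(left_then_left)   # row for a.id = 2 survives with NULLs
print(left_then_inner)  # row for a.id = 2 is filtered out by the inner join
```

In SQLite the second query runs without error, but the inner join silently removes the row that the left join had kept.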
For the sake of this example, let's say we have a table "employees" with ID, NAME and MANAGER_ID fields.
Here is a simple query:
SELECT E.ID, E.NAME, M.NAME AS MANAGER
FROM EMPLOYEES E
JOIN EMPLOYEES M ON E.MANAGER_ID = M.ID
This will return all of the employees who have a manager, along with the manager's name. But what happens for the boss, who has no manager? The NULL in MANAGER_ID prevents that row from being returned, because no matching record can be found to join on. To keep that row you would use an OUTER join (LEFT or RIGHT, depending on how you write the query).
The same logic holds for a query with two or more joins. If some rows may have no match in a join condition and you still want those rows to come back (albeit with NULLs), use an outer join for that join and you are golden.
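Here is a small sketch of the boss case, run in SQLite via Python's sqlite3 module. The three sample employees are invented for illustration; the column names follow the example above.

```python
import sqlite3

# Hypothetical data: 'Boss' has no manager, so MANAGER_ID is NULL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE EMPLOYEES (ID INTEGER PRIMARY KEY, NAME TEXT, MANAGER_ID INTEGER);
    INSERT INTO EMPLOYEES VALUES (1, 'Boss', NULL), (2, 'Amy', 1), (3, 'Raj', 2);
""")

inner_rows = conn.execute("""
    SELECT E.ID, E.NAME, M.NAME AS MANAGER
    FROM EMPLOYEES E
    JOIN EMPLOYEES M ON E.MANAGER_ID = M.ID
    ORDER BY E.ID
""").fetchall()

outer_rows = conn.execute("""
    SELECT E.ID, E.NAME, M.NAME AS MANAGER
    FROM EMPLOYEES E
    LEFT OUTER JOIN EMPLOYEES M ON E.MANAGER_ID = M.ID
    ORDER BY E.ID
""").fetchall()

print(inner_rows)  # the boss is missing
print(outer_rows)  # the boss appears with a NULL manager
```

The only change between the two queries is the join type, which is enough to bring the boss back with a NULL in the MANAGER column.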
For the sake of completeness and standards evangelism, I'll chime in with the ANSI-92 nested join syntax:
select t1.*
,t2.*
,t3.*
from table1 t1
left outer join (
table2 t2 left outer join table3 t3 on (t2.b = t3.b)
) on (t1.a = t2.a)
Your SQL engine of choice may optimize for them.
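One case where the nesting changes the result, rather than just the layout, is when the inner pair uses an INNER join. The sketch below is an illustration under assumptions: the sample rows are invented, the inner join type differs from the query above, and it relies on SQLite accepting a parenthesized join in the FROM clause, which may not hold for every engine.

```python
import sqlite3

# Hypothetical data: table2 row (2, 20) has no partner in table3.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (a INTEGER);
    CREATE TABLE table2 (a INTEGER, b INTEGER);
    CREATE TABLE table3 (b INTEGER);
    INSERT INTO table1 VALUES (1), (2);
    INSERT INTO table2 VALUES (1, 10), (2, 20);
    INSERT INTO table3 VALUES (10);
""")

# Flat form: the inner join applies to the result of the left join.
flat = conn.execute("""
    SELECT t1.a, t2.b, t3.b
    FROM table1 t1
    LEFT OUTER JOIN table2 t2 ON (t1.a = t2.a)
    INNER JOIN table3 t3 ON (t2.b = t3.b)
    ORDER BY t1.a
""").fetchall()

# Nested form: table2 and table3 are joined first, then left-joined to table1.
nested = conn.execute("""
    SELECT t1.a, t2.b, t3.b
    FROM table1 t1
    LEFT OUTER JOIN (
        table2 t2 INNER JOIN table3 t3 ON (t2.b = t3.b)
    ) ON (t1.a = t2.a)
    ORDER BY t1.a
""").fetchall()

print(flat)
print(nested)
```

In the nested form every row of table1 is kept, and the table2/table3 pair is attached only when both halves exist.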
This really depends on what you are doing. I've written many 3+ table queries that have an outer join in them. It just depends on the data you are querying and what you are trying to retrieve.
The same general logic for choosing a join type applies to queries with multiple joins as to single-join queries.
Read this great article on outer joins by a well-known expert, Terry Purcell.
There is also a great write-up by Plamen Ratchev.