Here is my table structure: SQL Fiddle
CREATE TABLE mytable (
id int,
related int
);
INSERT into mytable VALUES(1, NULL);
INSERT into mytable VALUES
LEFT JOIN returns every row from the left table. Where a row has a match in the right table, the matching right-table values are returned alongside it; where it has no match, NULL is returned for the right-table columns.
Let's look at what the dataset looks like when joined to itself with no condition, i.e. every row paired with every row. (The asterisks and pluses mark rows that are referred to below.)
 +-------+------------+-------+------------+
 | t1.id | t1.related | t2.id | t2.related |
 +-------+------------+-------+------------+
 | 1     | NULL       | 1     | NULL       |
+| 1     | NULL       | 2     | 1          |
+| 1     | NULL       | 4     | 1          |
*| 2     | 1          | 1     | NULL       |
 | 2     | 1          | 2     | 1          |
 | 2     | 1          | 4     | 1          |
*| 4     | 1          | 1     | NULL       |
 | 4     | 1          | 2     | 1          |
 | 4     | 1          | 4     | 1          |
 +-------+------------+-------+------------+
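For reference, here is a rough sketch of how that nine-row table could be produced. The three rows are inferred from the table itself, since the INSERT statements are not fully shown, and the column aliases (t1_id and so on) are just names picked for illustration.

```sql
-- Setup: rows inferred from the joined table above: (1, NULL), (2, 1), (4, 1).
-- Skip this part if mytable already exists and is populated.
CREATE TABLE mytable (
  id int,
  related int
);
INSERT INTO mytable VALUES (1, NULL), (2, 1), (4, 1);

-- Pair every row of t1 with every row of t2: 3 x 3 = 9 rows.
SELECT t1.id      AS t1_id,
       t1.related AS t1_related,
       t2.id      AS t2_id,
       t2.related AS t2_related
FROM mytable t1
CROSS JOIN mytable t2
ORDER BY t1.id, t2.id;
```

Each join condition discussed next simply keeps a subset of these nine rows.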
The left table is t1 in both queries.
In the first query, we look for matches on t1.related = t2.id (the middle two columns). Those are the rows marked with an asterisk. The row with t1.id = 1 has no match, because its related value is NULL and NULL never compares equal to anything. LEFT JOIN still has to return that row, so it comes back with NULL in the t2 column.
+-------+-------+
| t1.id | t2.id |
+-------+-------+
| 1     | NULL  |  added because LEFT JOIN
| 2     | 1     |  marked with * in table above
| 4     | 1     |  marked with * in table above
+-------+-------+
In the second query, we look for matches on t1.id = t2.related (the outer two columns). Those are the rows marked with a plus. The rows with t1.id = 2 and t1.id = 4 have no match, because no row has a related value of 2 or 4. LEFT JOIN still has to return those rows, so they come back with NULL in the t2 column.
+-------+-------+
| t1.id | t2.id |
+-------+-------+
| 1     | 2     |  marked with + in table above
| 1     | 4     |  marked with + in table above
| 2     | NULL  |  added because LEFT JOIN
| 4     | NULL  |  added because LEFT JOIN
+-------+-------+