When I execute the following query:
SELECT * FROM `table1`
INNER JOIN table2 ON table2.number = table1.number
I get the result within 2 s
Instead of this you can use this method:
SELECT * FROM `table1`
LEFT JOIN table2 ON table2.number = table1.number
WHERE
table2.number is NULL OR table1.number is NULL
Let's say your first INNER JOIN
returns 75% of the 1,000,000 rows in table1
. The second query does not return the 250,000 other rows as you think. Instead, it attempts to create a Cartesian product and remove the 750,000 matching rows. Thus it's trying to return 6,000,000×1,000,000-750,000 rows. That's a bulging 6×1012 row result set.
You probably want this:
SELECT * FROM table1
LEFT JOIN table2 ON table2.number = table1.number
WHERE table2.number IS NULL
This returns rows in table1
not present in table2
.
You might also be interested in FULL OUTER JOIN
:
SELECT * FROM table1
FULL OUTER JOIN table2 ON table2.number = table1.number
WHERE table1.number IS NULL AND table2.number IS NULL
This returns rows in both tables that don't have a match on the other table.
The reason this isn't working is cause your basiclly joining every row of table1 with every row with table 2. You need something to still join it with. The best way to do this is to do a left join (Meaning it'll join table1 no matter what, but not table2) and then check to make sure there isn't an entry for table2 with the is null. You'll then need to do the same thing for table2.
SELECT * FROM `table1`
LEFT JOIN table2 ON table2.number = table1.number
WHERE table2.number is NULL
UNION
SELECT * FROM `table2`
LEFT JOIN table1 ON table2.number = table1.number
WHERE table1.number is NULL