Why is my query returning far to many results?

空扰寡人 提交于 2021-02-09 11:40:50

问题


I have a bunch of candidates, who have had one or more jobs, each with a company, using some skills.

Bad ascii art follows:

         ---------------                              ---------------   
         | candidate 1 |                              | candidate 2 |
         --------------- \                             --------------      
           /              \                                 |
       -------              --------                        etc
       |job 1|              | job 2 |  
       -------              ---------  
        /     \              /      \  
  ---------   ---------  ---------   --------  
  |company |  | skills | |company | | skills |  
  ---------   ---------  ---------- ----------  

Here's my database:

mysql> describe jobs;
+--------------+---------+------+-----+---------+----------------+
| Field        | Type    | Null | Key | Default | Extra          |
+--------------+---------+------+-----+---------+----------------+
| job_id       | int(11) | NO   | PRI | NULL    | auto_increment |
| candidate_id | int(11) | NO   | MUL | NULL    |                |
| company_id   | int(11) | NO   | MUL | NULL    |                |
| start_date   | date    | NO   | MUL | NULL    |                |
| end_date     | date    | NO   | MUL | NULL    |                |
+--------------+---------+------+-----+---------+----------------+

.

mysql> describe candidates;
+----------------+----------+------+-----+---------+----------------+
| Field          | Type     | Null | Key | Default | Extra          |
+----------------+----------+------+-----+---------+----------------+
| candidate_id   | int(11)  | NO   | PRI | NULL    | auto_increment |
| candidate_name | char(50) | NO   | MUL | NULL    |                |
| home_city      | char(50) | NO   | MUL | NULL    |                |
+----------------+----------+------+-----+---------+----------------+

.

mysql> describe companies;
+-------------------+---------------+------+-----+---------+----------------+

| Field             | Type          | Null | Key | Default | Extra          |
+-------------------+---------------+------+-----+---------+----------------+
| company_id        | int(11)       | NO   | PRI | NULL    | auto_increment |
| company_name      | char(50)      | NO   | MUL | NULL    |                |
| company_city      | char(50)      | NO   | MUL | NULL    |                |
| company_post_code | char(50)      | NO   |     | NULL    |                |
| latitude          | decimal(11,8) | NO   |     | NULL    |                |
| longitude         | decimal(11,8) | NO   |     | NULL    |                |
+-------------------+---------------+------+-----+---------+----------------+

.

mysql> describe skills;
+----------+---------+------+-----+---------+-------+
| Field    | Type    | Null | Key | Default | Extra |
+----------+---------+------+-----+---------+-------+
| skill_id | int(11) | NO   | MUL | NULL    |       |
| job_id   | int(11) | NO   | MUL | NULL    |       |
+----------+---------+------+-----+---------+-------+

.

mysql> describe skill_names;
+------------+----------+------+-----+---------+----------------+
| Field      | Type     | Null | Key | Default | Extra          |
+------------+----------+------+-----+---------+----------------+
| skill_id   | int(11)  | NO   | PRI | NULL    | auto_increment |
| skill_name | char(32) | NO   | MUL | NULL    |                |
+------------+----------+------+-----+---------+----------------+

Here's my attempt at a query (note, I intend to change the wildcards to field names; I am just trying to get something working):

SELECT can.* , co.*, j.*, sn.*
FROM candidates AS can
JOIN jobs AS j 
JOIN companies AS co ON j.company_id = co.company_id
JOIN skills AS s ON s.job_id = j.job_id
JOIN skill_names AS sn ON s.skill_id = s.skill_id

HediSql says /* Affected rows: 0 Found rows: 34,461,651 Warnings: 0 Duration for 1 query: 0.000 sec. (+ 105.078 sec. network) */

What's wrong with the query? I hope that it is clear from the bad ascii art what I am trying to achieve.

(also, does it make any speed difference which order I join the tables? I will worry about the new MySql v8 functions which retrieve it as nested JSON later)


回答1:


You are missing a join condition between candidates and jobs, so you get a cartesian product between both tables. Also, there is a problem with the join condition on skill_names, where both columns are the same (this again generates a cartesian product).

SELECT can.* , co.*, j.*, sn.*
FROM candidates AS can
JOIN jobs AS j ON j.candidate_id = can.candidate_id --> here: missing join condition
JOIN companies AS co ON j.company_id = co.company_id
JOIN skills AS s ON s.job_id = j.job_id
JOIN skill_names AS sn ON sn.skill_id = s.skill_id  --> and here: wrong join condition

Many RDBMS would raise a syntax error on a JOIN without an ON clause (if you do want a cartesian product, you need to be explicit about it by using CROSS JOIN), but, alas, not MySQL.


When it comes to this question:

does it make any speed difference which order I join the tables?

No. As long as you are using inner joins (not left joins), the join order does not matter to the query planner, which will rearrange them in the order which it thinks is the more efficient.



来源:https://stackoverflow.com/questions/59790324/why-is-my-query-returning-far-to-many-results

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!