Left Join outperforming Inner Join?

I've been profiling some queries in an application I'm working on, and I came across a query that was retrieving more rows than necessary, the result set being trimmed dow

6 Answers

渐次进展

2021-01-01 20:02

Try this:

```
SELECT `contacts`.*,
       `lists`.`name` AS `group`,
       `lists`.`id` AS `group_id`,
       `lists`.`shared_yn`,
       `tags`.`name` AS `context`,
       `tags`.`id` AS `context_id`,
       `tags`.`color` AS `context_color`,
       `users`.`id` AS `user_id`,
       `users`.`avatar`
FROM `contacts`
INNER JOIN `users` ON contacts.user_id='1' AND users.email=contacts.email
LEFT JOIN `lists` ON lists.id=contacts.list_id
LEFT JOIN `lists_to_users` ON lists_to_users.user_id='1' AND lists_to_users.creator='1' AND lists_to_users.list_id=lists.id
LEFT JOIN `tags` ON tags.id=lists_to_users.tag_id
ORDER BY `contacts`.`name` ASC
```

That should give you some extra performance because:

- All the inner joins come before any "left" or "right" join appears. This filters out some records before the subsequent outer joins are applied.
- The "AND" operators can short-circuit (so the order of the "AND" operands matters). If the comparison between a column and a literal is false, the server doesn't need to perform the table scan required for the comparison between the tables' PKs and FKs. Keep in mind that the optimizer is free to reorder conditions, so check the result with EXPLAIN.

If you don't find any performance improvement, replace the whole column list with "COUNT(*)" and rerun your left/inner tests. This way, regardless of the query, you retrieve only a single row with a single column (the count), so you can rule out the number of returned bytes as the cause of the slowness:

```
SELECT COUNT(*)
FROM `contacts`
INNER JOIN `users` ON contacts.user_id='1' AND users.email=contacts.email
LEFT JOIN `lists` ON lists.id=contacts.list_id
LEFT JOIN `lists_to_users` ON lists_to_users.user_id='1' AND lists_to_users.creator='1' AND lists_to_users.list_id=lists.id
LEFT JOIN `tags` ON tags.id=lists_to_users.tag_id
```

Good luck

梦毁少年i

2021-01-01 20:03

IMO you are falling into the pitfall known as premature optimization. Query optimizers are insanely fickle things. My suggestion is to move on until you can identify for sure that a particular join is problematic.

暗喜

2021-01-01 20:05

It's probably because the INNER JOIN has to check each row in both tables to see whether the column values (email in your case) match, whereas the LEFT JOIN returns all rows from one table regardless. If the column is indexed, the server can find the matches faster too.
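
As a minimal sketch of the indexing point, assuming `users.email` and `contacts.email` are VARCHAR columns that are not already indexed (the index names `idx_users_email` and `idx_contacts_user_email` are hypothetical), you could add indexes on the join columns and look at the plan again:

```sql
-- Hypothetical index names; skip any index your schema already has.
CREATE INDEX idx_users_email ON users (email);
CREATE INDEX idx_contacts_user_email ON contacts (user_id, email);

-- Re-check the plan for the join on email after adding the indexes.
EXPLAIN
SELECT contacts.name, users.id AS user_id
FROM contacts
INNER JOIN users ON users.email = contacts.email
WHERE contacts.user_id = '1';
```

If the `key` column of the EXPLAIN output still shows NULL for `users`, the optimizer decided not to use the new index, which can happen on very small tables.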

滥情空心

2021-01-01 20:07
If you think of the implementation of LEFT JOIN as INNER JOIN plus more work, then this result is confusing. But what if the implementation of INNER JOIN is LEFT JOIN plus filtering? Then it becomes clear.

In the query plans, the only difference is this: for `users`, the Extra column shows "Using where". This means filtering. There's an extra filtering step in the query with the inner join.

This is a different kind of filtering than is typically used in a WHERE clause. For a typical WHERE clause such as the following, it is simple to create an index on A to support the filtering:
```
SELECT *
FROM A
WHERE A.ID = 3
```
Consider this query:
```
SELECT *
FROM A
  LEFT JOIN B
  ON A.ID = B.ID
WHERE B.ID is not null
```
This query is equivalent to an inner join. There is no index on B that will help that filtering action, because the WHERE clause states a condition on the result of the join rather than a condition on B.
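
To check the equivalence yourself, here is a small sketch using throwaway tables named after the A and B above (the sample rows are made up for illustration). Both SELECT statements return the IDs 2 and 3. Whether the two plans actually differ depends on the server version, since the optimizer may rewrite the second form into an inner join on its own.

```sql
-- Throwaway demo tables; the rows are illustrative only.
CREATE TABLE A (ID INT PRIMARY KEY);
CREATE TABLE B (ID INT PRIMARY KEY);
INSERT INTO A (ID) VALUES (1), (2), (3);
INSERT INTO B (ID) VALUES (2), (3), (4);

-- Inner join: only IDs present in both tables.
SELECT A.ID
FROM A
  INNER JOIN B
  ON A.ID = B.ID
ORDER BY A.ID;

-- Left join plus a filter on the joined result: the same rows.
SELECT A.ID
FROM A
  LEFT JOIN B
  ON A.ID = B.ID
WHERE B.ID IS NOT NULL
ORDER BY A.ID;

-- Compare the Extra column of this plan with the inner join's plan.
EXPLAIN
SELECT A.ID
FROM A
  LEFT JOIN B
  ON A.ID = B.ID
WHERE B.ID IS NOT NULL;
```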

囚心锁ツ

2021-01-01 20:16

Table cardinality influences the query optimizer. I guess that with tables as small as yours, the inner join ends up being the more complex operation. Once you have more records than the DB server is willing to keep in memory, the inner join will probably begin to outperform the left join.
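
One way to see what the optimizer has to work with, sketched here against the `contacts` and `users` tables from the question, is to look at the row counts and at the per-index cardinality estimates:

```sql
-- Actual row counts of the two joined tables.
SELECT COUNT(*) FROM contacts;
SELECT COUNT(*) FROM users;

-- The Cardinality column is the server's estimate of the number of
-- distinct values in each index; the optimizer uses it to choose a plan.
SHOW INDEX FROM contacts;
SHOW INDEX FROM users;
```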

执念已碎

2021-01-01 20:23
LEFT JOIN returns more rows than INNER JOIN because the two are different operations.
If LEFT JOIN does not find a related entry in the table it is joining to, it returns NULLs for that table's columns.
If INNER JOIN does not find a related entry, it does not return the row at all.
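
A minimal sketch of that difference, using hypothetical tables `demo_contacts` and `demo_users` with made-up rows rather than the real schema:

```sql
-- Hypothetical demo tables; only Ann has a matching user.
CREATE TABLE demo_contacts (name VARCHAR(50), email VARCHAR(100));
CREATE TABLE demo_users (id INT, email VARCHAR(100));
INSERT INTO demo_contacts (name, email)
VALUES ('Ann', 'ann@example.com'), ('Bob', 'bob@example.com');
INSERT INTO demo_users (id, email) VALUES (1, 'ann@example.com');

-- LEFT JOIN: two rows, Ann with user_id 1 and Bob with user_id NULL.
SELECT c.name, u.id AS user_id
FROM demo_contacts c
LEFT JOIN demo_users u ON u.email = c.email
ORDER BY c.name;

-- INNER JOIN: one row, Ann with user_id 1; Bob is dropped.
SELECT c.name, u.id AS user_id
FROM demo_contacts c
INNER JOIN demo_users u ON u.email = c.email
ORDER BY c.name;
```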

But to your question, do you have query_cache enabled? Try running the query with
```
SELECT SQL_NO_CACHE `contacts`.*, ...
```
Other than that, I'd populate the tables with more data, then run
```
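-- Refresh the key distribution statistics the optimizer relies on
ANALYZE TABLE t1, t2;
-- Rebuild the tables to reclaim unused space
OPTIMIZE TABLE t1, t2;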
```
And see what happens.