Left Join outperforming Inner Join?

不思量自难忘° 2021-01-01 19:56

I've been profiling some queries in an application I'm working on, and I came across a query that was retrieving more rows than necessary, the result set being trimmed dow

6 answers
  • 2021-01-01 20:02

    Try this:

    SELECT `contacts`.*, `lists`.`name` AS `group`, `lists`.`id` AS `group_id`, `lists`.`shared_yn`, `tags`.`name` AS `context`, `tags`.`id` AS `context_id`, `tags`.`color` AS `context_color`, `users`.`id` AS `user_id`, `users`.`avatar` 
    FROM `contacts`  
    INNER JOIN `users` ON contacts.user_id='1' AND users.email=contacts.email
    LEFT JOIN `lists` ON lists.id=contacts.list_id  
    LEFT JOIN `lists_to_users` ON lists_to_users.user_id='1' AND lists_to_users.creator='1' AND lists_to_users.list_id=lists.id
    LEFT JOIN `tags` ON tags.id=lists_to_users.tag_id 
    ORDER BY `contacts`.`name` ASC
    

    That should give you some extra performance because:

    • All the inner joins come before any "left" or "right" join. This filters out some records before the subsequent outer joins are applied.
    • The "AND" operators short-circuit, so the order of the conditions matters. If the comparison between a column and a literal is false, the engine does not have to run the table scan needed to compare the tables' PKs and FKs.

    If you don't see any performance improvement, replace the whole column list with "COUNT(*)" and repeat your left/inner tests. That way, regardless of the query, you retrieve a single row with a single column (the count), so you can rule out the number of returned bytes as the cause of the slowness:

    SELECT COUNT(*)
    FROM `contacts`  
    INNER JOIN `users` ON contacts.user_id='1' AND users.email=contacts.email
    LEFT JOIN `lists` ON lists.id=contacts.list_id  
    LEFT JOIN `lists_to_users` ON lists_to_users.user_id='1' AND lists_to_users.creator='1' AND lists_to_users.list_id=lists.id
    LEFT JOIN `tags` ON tags.id=lists_to_users.tag_id 
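
The COUNT(*) comparison can be scripted so both variants run against the same data. This is only a rough sketch: SQLite (through Python's `sqlite3`) stands in for MySQL, the two-table schema is a cut-down guess at `contacts` and `users`, the rows are synthetic, and `timed_count` is a made-up helper name. Timings from SQLite will not carry over to MySQL, so treat it as a template for the test rather than a measurement.

```python
import sqlite3
import time

# Assumption: SQLite stands in for MySQL; the schema is a simplified guess
# at the `contacts` and `users` tables and all rows are synthetic.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT);
    CREATE TABLE contacts (id INTEGER PRIMARY KEY, user_id INTEGER, email TEXT);
""")

# 1000 contacts owned by user 1; only every second one has a matching user row.
conn.executemany(
    "INSERT INTO contacts (user_id, email) VALUES (1, ?)",
    [(f"c{i}@example.com",) for i in range(1000)],
)
conn.executemany(
    "INSERT INTO users (email) VALUES (?)",
    [(f"c{i}@example.com",) for i in range(0, 1000, 2)],
)


def timed_count(join_kind):
    """Run the COUNT(*) query with the given join keyword; return (count, seconds)."""
    sql = (
        f"SELECT COUNT(*) FROM contacts {join_kind} JOIN users "
        "ON contacts.user_id = 1 AND users.email = contacts.email"
    )
    start = time.perf_counter()
    count = conn.execute(sql).fetchone()[0]
    return count, time.perf_counter() - start


inner_count, inner_secs = timed_count("INNER")
left_count, left_secs = timed_count("LEFT")
print(f"INNER: {inner_count} rows in {inner_secs:.4f}s")
print(f"LEFT:  {left_count} rows in {left_secs:.4f}s")
```

Since only one number comes back in both cases, any timing gap that remains comes from the join itself and not from the size of the result set.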
    

    Good luck

  • 2021-01-01 20:03

    IMO you are falling into the pitfall known as premature optimization. Query optimizers are insanely fickle things. My suggestion is to move on until you can identify for sure that a particular join is problematic.

  • 2021-01-01 20:05

    It's probably due to the INNER JOIN having to check each row in both tables to see if the column values (email in your case) match. The LEFT JOIN will return all rows from one table regardless. If the column is indexed, the matching rows will be found faster too.
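
Whether an index on the join column is actually picked up can be checked from the query plan instead of guessed. A small sketch, under these assumptions: SQLite stands in for MySQL (where you would use `EXPLAIN`), the table layout is simplified, and the index name `idx_users_email` and the helper `plan` are hypothetical.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; the table layout and the index
# name idx_users_email are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, avatar TEXT);
    CREATE TABLE contacts (id INTEGER PRIMARY KEY, user_id INTEGER, email TEXT);
    CREATE INDEX idx_users_email ON users (email);
""")


def plan(sql):
    """Return the human-readable steps of SQLite's query plan for `sql`."""
    # The fourth column of EXPLAIN QUERY PLAN output is the detail text.
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


left_plan = plan(
    "SELECT contacts.id, users.avatar FROM contacts "
    "LEFT JOIN users ON users.email = contacts.email"
)
inner_plan = plan(
    "SELECT contacts.id, users.avatar FROM contacts "
    "INNER JOIN users ON users.email = contacts.email"
)

for step in left_plan:
    print("LEFT: ", step)
for step in inner_plan:
    print("INNER:", step)
```

If the index name shows up in the plan for `users`, the lookup on `email` goes through the index and not a full scan of the table.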

  • 2021-01-01 20:07

    If you think that the implementation of LEFT JOIN is INNER JOIN + more work, then this result is confusing. What if the implementation of INNER JOIN is (LEFT JOIN + filtering)? Ah, it is clear now.

    In the query plans, the only difference is this: users... Extra: Using where. This means filtering. There's an extra filtering step in the query with the inner join.

    This is a different kind of filtering than is typically used in a where clause. For a typical where clause like the one below, it is simple to create an index on A to support the filtering action.

    SELECT *
    FROM A
    WHERE A.ID = 3
    

    Consider this query:

    SELECT *
    FROM A
      LEFT JOIN B
      ON A.ID = B.ID
    WHERE B.ID is not null
    

    This query is equivalent to inner join. There is no index on B that will help that filtering action. The reason is that the where clause is stating a condition on the result of the join, instead of a condition on B.
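
The equivalence claimed above is easy to check on toy data. A minimal sketch, assuming SQLite as a stand-in engine and made-up rows for `A` and `B` (the extra `label` and `note` columns are invented for illustration):

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; rows and the extra columns are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (ID INTEGER PRIMARY KEY, label TEXT);
    CREATE TABLE B (ID INTEGER PRIMARY KEY, note TEXT);
    INSERT INTO A VALUES (1, 'a1'), (2, 'a2'), (3, 'a3');
    INSERT INTO B VALUES (2, 'b2'), (3, 'b3'), (4, 'b4');
""")

# LEFT JOIN, then discard the rows that found no partner in B.
filtered_left = conn.execute(
    "SELECT * FROM A LEFT JOIN B ON A.ID = B.ID "
    "WHERE B.ID IS NOT NULL ORDER BY A.ID"
).fetchall()

# Plain INNER JOIN on the same condition.
inner = conn.execute(
    "SELECT * FROM A INNER JOIN B ON A.ID = B.ID ORDER BY A.ID"
).fetchall()

print(filtered_left)
print(filtered_left == inner)
```

Both queries return the same rows here. The difference the answer points at is in how the engine gets there: the `IS NOT NULL` test is applied to the joined result, not to `B` directly.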

  • 2021-01-01 20:16

    Table cardinality has an influence on the query optimizer. My guess is that with tables as small as yours, the inner join is the more complex operation. As soon as you have more records than the DB server is willing to keep in memory, the inner join will probably begin to outperform the left join.

  • 2021-01-01 20:23

    LEFT JOIN returns more rows than INNER JOIN because the two are different operations.
    If LEFT JOIN does not find a related entry in the joined table, it returns NULLs for that table's columns.
    But if INNER JOIN does not find a related entry, it does not return the row at all.
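
A tiny illustration of that difference, again only a sketch: SQLite stands in for MySQL, and the two contacts and the single list are invented sample rows.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; the sample rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE lists (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE contacts (id INTEGER PRIMARY KEY, name TEXT, list_id INTEGER);
    INSERT INTO lists VALUES (10, 'Friends');
    INSERT INTO contacts VALUES (1, 'Ann', 10), (2, 'Bob', NULL);
""")

query = (
    "SELECT contacts.name, lists.name FROM contacts "
    "{kind} JOIN lists ON lists.id = contacts.list_id "
    "ORDER BY contacts.name"
)
left_rows = conn.execute(query.format(kind="LEFT")).fetchall()
inner_rows = conn.execute(query.format(kind="INNER")).fetchall()

print(left_rows)   # Bob is kept, with None (NULL) as the list name
print(inner_rows)  # Bob is dropped because he has no matching list
```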

    But to your question, do you have query_cache enabled? Try running the query with

    SELECT SQL_NO_CACHE `contacts`.*, ...
    

    Other than that, I'd populate the tables with more data, then run

    ANALYZE TABLE t1, t2;
    OPTIMIZE TABLE t1, t2;
    

    And see what happens.
