In what order does execution on WHERE and ON clauses work?

隐瞒了意图╮ 2021-01-23 16:41

I was reading this page about APPLY:

http://sqlblog.com/blogs/alexander_kuznetsov/archive/2009/07/07/using-cross-apply-to-optimize-joins-on-between-conditions.aspx

4 answers
  • 2021-01-23 17:09

    There is no definitive "before" and "after" in these queries. The RDBMS is allowed to decide when to run which part of the query, as long as the results the query produces do not change.

    In the first case, there is nothing the optimizer can do to pre-filter the rows of Commercials, because the WHERE clause constrains only the rows of Calls. The join conditions specify a range for c.AirTime in terms of the corresponding row of Commercials, so no pre-filtering is possible: all rows of Calls would be considered for each row of Commercials.

    In the second case, however, the RDBMS can save time by observing that you additionally constrain the range of c.AirTime, from 23:45 on Jun-30, 2008 through midnight of Jul-1, 2008, by constraining s.StartedAt, to which c.AirTime is joined. This can allow the optimizer to use an index, if one is defined on the Calls.AirTime column.

    The important observation here is that the RDBMS can do very clever things when optimizing your query. It arrives at the optimized strategy by applying multiple rules of logic, trying to push the constraints closer to the "source of rows" in a join. The best way to check what the optimizer does is to read the query plan.
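
The claim in this answer, that the engine may reorder the work as long as the results do not change, can be checked on a toy example. The sketch below is not SQL Server: it uses Python's built-in `sqlite3` as a stand-in, and the table contents are made up for illustration. It only shows that, for an inner join, putting the date filter in WHERE or in ON returns the same rows.

```python
import sqlite3

# Hypothetical miniature of the Commercials/Calls tables; all rows are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Commercials (StartedAt TEXT, EndedAt TEXT);
    CREATE TABLE Calls (AirTime TEXT);
    INSERT INTO Commercials VALUES
        ('2008-06-30 23:50', '2008-07-01 00:10'),
        ('2008-07-01 01:00', '2008-07-01 01:05'),
        ('2008-07-02 09:00', '2008-07-02 09:05');
    INSERT INTO Calls VALUES
        ('2008-07-01 00:05'),
        ('2008-07-01 01:02'),
        ('2008-07-01 02:00'),
        ('2008-07-02 09:01');
""")

# The extra range filter; ISO-formatted text compares correctly as strings.
FILTER = "c.AirTime BETWEEN '2008-07-01 00:00' AND '2008-07-01 03:00'"

# Variant 1: the filter lives in the WHERE clause.
filter_in_where = f"""
    SELECT s.StartedAt, s.EndedAt, c.AirTime
    FROM Commercials s
    JOIN Calls c ON c.AirTime >= s.StartedAt AND c.AirTime < s.EndedAt
    WHERE {FILTER}
    ORDER BY c.AirTime
"""

# Variant 2: the same filter is moved into the ON clause.
filter_in_on = f"""
    SELECT s.StartedAt, s.EndedAt, c.AirTime
    FROM Commercials s
    JOIN Calls c ON c.AirTime >= s.StartedAt AND c.AirTime < s.EndedAt
                AND {FILTER}
    ORDER BY c.AirTime
"""

rows_where = conn.execute(filter_in_where).fetchall()
rows_on = conn.execute(filter_in_on).fetchall()
print(rows_where == rows_on)  # same rows either way for an inner join
```

This equivalence holds for inner joins only; with an outer join, moving a condition between ON and WHERE can change the result.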

  • 2021-01-23 17:15

    A massive amount of logic, time, blood, sweat, and tears has gone into the SQL Server query optimizer, which determines the query plan and therefore how a statement is actually processed. The order in which a statement is written does not dictate the order in which the engine executes it.

    To really see what's going on, run your queries with the actual execution plan enabled (the Include Actual Execution Plan option). My guess is that, based on the additional WHERE clause, the data is being pre-filtered by the optimizer.
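
As a rough illustration of what "looking at the plan" means, the sketch below asks an engine for its plan from code. It is an assumption-laden stand-in: it uses SQLite through Python's `sqlite3` module rather than SQL Server, the index name `idx_calls_airtime` is hypothetical, and SQLite's planner makes different choices than SQL Server's optimizer. The idea carries over, though: the plan, not the statement text, tells you whether an index is used.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Commercials (StartedAt TEXT, EndedAt TEXT);
    CREATE TABLE Calls (AirTime TEXT);
    -- Hypothetical index; the thread does not say whether one exists.
    CREATE INDEX idx_calls_airtime ON Calls (AirTime);
""")

query = """
    SELECT s.StartedAt, s.EndedAt, c.AirTime
    FROM Commercials s
    JOIN Calls c ON c.AirTime >= s.StartedAt AND c.AirTime < s.EndedAt
    WHERE c.AirTime BETWEEN '2008-07-01 00:00' AND '2008-07-01 03:00'
"""

# EXPLAIN QUERY PLAN returns one row per plan step; the last column of each
# row is a human-readable description of that step.
plan_steps = [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + query)]
for step in plan_steps:
    print(step)
```

The exact wording of the steps depends on the SQLite version, so no expected output is shown here.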

  • 2021-01-23 17:17

    The second query is faster because you are limiting the scope of the join.

    First query: A join B

    Second query: A join subset(B)

    As subset(B) is smaller than B itself, there are far fewer candidate matches to scan.

    And that leads to the question: does the column used in that join have an index? (Probably not, or the speeds would not differ much.)
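
The "A join subset(B)" argument can be made concrete by counting comparisons. The sketch below assumes the simplest possible strategy, a naive nested-loop join with no index, and uses made-up data (times as minutes since midnight). A real engine may pick a different join strategy, so treat the counts as an illustration of the idea, not a measurement.

```python
def nested_loop_join(outer, inner, predicate):
    """Naive nested-loop join that also counts predicate evaluations."""
    matches, comparisons = [], 0
    for a in outer:
        for b in inner:
            comparisons += 1
            if predicate(a, b):
                matches.append((a, b))
    return matches, comparisons


def aired_during(commercial, call):
    """True if the call time falls in [started_at, ended_at)."""
    started_at, ended_at = commercial
    return started_at <= call < ended_at


# Made-up data: (started_at, ended_at) pairs and call times, in minutes.
commercials = [(0, 10), (60, 65), (600, 605)]
calls = [5, 62, 120, 601, 900, 1200]

# First query: A join B.
full_matches, full_cost = nested_loop_join(commercials, calls, aired_during)

# Second query: A join subset(B), where B is pre-filtered to a time range.
calls_subset = [c for c in calls if 0 <= c <= 180]
subset_matches, subset_cost = nested_loop_join(
    commercials, calls_subset, aired_during
)

print(full_cost, subset_cost)                   # 18 9
print(len(full_matches), len(subset_matches))   # 3 2
```

Note that the pre-filtered join also returns fewer rows, so the two queries are not interchangeable; the filter changes both the cost and the result.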

  • 2021-01-23 17:18

    They are not the same queries, so why would you expect the same response times?

    If the two queries return a different number of rows, use TOP (n) with the same n in both for a fairer comparison.

    The query optimizer can be very smart (and it can be stupid).
    View the query plan to see what is going on.

    In my experience, the query optimizer has a better chance of being smart if you pull the conditions into the join:

    SELECT s.StartedAt, s.EndedAt, c.AirTime 
     FROM dbo.Commercials s 
     JOIN dbo.Calls c  
       ON c.AirTime >= s.StartedAt 
      AND c.AirTime < s.EndedAt 
      AND c.AirTime BETWEEN '20080701' AND '20080701 03:00' 
      AND s.StartedAt BETWEEN '20080630 23:45' AND '20080701 03:00'
    

    If you have just a single join, the query optimizer may apply a WHERE condition early.
    But with multiple joins, I have never seen the query optimizer apply a WHERE condition early.
