Is an anti-join more efficient than a left outer join?

前端 未结 2 1556
既然无缘
既然无缘 2021-01-14 01:38

A comment on this answer notes that anti-joins may have been optimized to be more efficient that outer joins in Oracle. I\'d be interested to see what explanations/evidence

相关标签:
2条回答
  • 2021-01-14 02:05

    If Oracle can transform left join + where is null into ANTI join then it's exactly the same.

    create table ttt1 as select mod(rownum,10) id from dual connect by level <= 50000;
    insert into ttt1 select 10 from dual;
    create table ttt2 as select mod(rownum,10) id from dual connect by level <= 50000;
    
    select ttt1.id
      from ttt1
      left join ttt2
        on ttt1.id = ttt2.id
     where ttt2.id is null;
    
    select * from ttt1 where id not in (select id from ttt2);
    

    If you have a look at

    Final query after transformations:******* UNPARSED QUERY IS *******
    

    in trace for event 10053 then you'll find two exactly the same queries (you can see "=" in predicate in the trace file because there is no special sign for ANTI join)

    SELECT "TTT1"."ID" "ID" FROM "TTT2" "TTT2","TTT1" "TTT1" WHERE "TTT1"."ID"="TTT2"."ID"
    

    And they have exactly the same plans

    -----------------------------------
    | Id  | Operation          | Name |
    -----------------------------------
    |   0 | SELECT STATEMENT   |      |
    |   1 |  HASH JOIN ANTI    |      |
    |   2 |   TABLE ACCESS FULL| TTT1 |
    |   3 |   TABLE ACCESS FULL| TTT2 |
    -----------------------------------
    

    If you, however, put a hint to disable transformations then plan will be

    select --+ no_query_transformation
           ttt1.id
      from ttt1, ttt2
     where ttt1.id = ttt2.id(+) and ttt2.id is null;
    
    ------------------------------------
    | Id  | Operation           | Name |
    ------------------------------------
    |   0 | SELECT STATEMENT    |      |
    |   1 |  FILTER             |      |
    |   2 |   HASH JOIN OUTER   |      |
    |   3 |    TABLE ACCESS FULL| TTT1 |
    |   4 |    TABLE ACCESS FULL| TTT2 |
    ------------------------------------
    

    and performance will degrade significantly.

    If you will use ANSI join syntax with deisabled transformation it will be even worse.

    select --+ no_query_transformation
           ttt1.id
      from ttt1
      left join ttt2
        on ttt1.id = ttt2.id
     where ttt2.id is null;
    select * from table(dbms_xplan.display_cursor(format => 'BASIC'));
    
    --------------------------------------------------
    | Id  | Operation              | Name            |
    --------------------------------------------------
    |   0 | SELECT STATEMENT       |                 |
    |   1 |  VIEW                  |                 |
    |   2 |   FILTER               |                 |
    |   3 |    MERGE JOIN OUTER    |                 |
    |   4 |     TABLE ACCESS FULL  | TTT1            |
    |   5 |     BUFFER SORT        |                 |
    |   6 |      VIEW              | VW_LAT_2131DCCF |
    |   7 |       TABLE ACCESS FULL| TTT2            |
    --------------------------------------------------
    

    So, in a nutshell, if Oracle can apply transformation to ANTI join then performance is exactly the same otherwise it can be worse. You can also use hint "--+ rule" to disable CBO transformations and see what happenes.

    PS. On a separate note, SEMI join may be in some cases much better than inner join + distinct even with enabled CBO transformations.

    0 讨论(0)
  • 2021-01-14 02:16

    When you use "not exists" or "not in" in your SQL query, you let Oracle to choose merge anti-join or hash anti-join access paths.

    Quick Explanation

    For example, given join betwen table A and B (from A join B on A.x = B.x) Oracle will fetch all relevant data from table A, and try to match them with corresponding rows in table B, so it's strictly dependent on selectivity of table A predicate.

    When using anti-join optimization, Oracle can choose the table with higher selectivity and match it with the other one, which may result in much faster code.

    It can't do that with regular join or subquery, because it can't assume that one match between tables A and B is enough to return that row.

    Related hints: HASH_AJ, MERGE_AJ.

    More:

    This looks like a nice and detailed article on the subject.

    Here is another, more dencet article.

    0 讨论(0)
提交回复
热议问题