How do NULL values affect performance in a database search?

死守一世寂寞 2020-12-02 13:13

In our product we have a generic search engine, and we are trying to optimize its search performance. Many of the tables used in the queries allow NULL values. Should we redesign o

8 answers
  • 2020-12-02 14:06

    An extra answer to draw some extra attention to David Aldridge's comment on Quassnoi's accepted answer.

    The statement:

    this query:

    SELECT * FROM table WHERE column IS NULL

    will always use full table scan

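    is not true. Here is a counterexample that uses an index with a literal value as its second column. Because the constant 1 is never NULL, every row gets an index entry, including the rows where mycolumn is NULL: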

    SQL> create table mytable (mycolumn)
      2  as
      3   select nullif(level,10000)
      4     from dual
      5  connect by level <= 10000
      6  /
    
    Table created.
    
    SQL> create index i1 on mytable(mycolumn,1)
      2  /
    
    Index created.
    
    SQL> exec dbms_stats.gather_table_stats(user,'mytable',cascade=>true)
    
    PL/SQL procedure successfully completed.
    
    SQL> set serveroutput off
    SQL> select /*+ gather_plan_statistics */ *
      2    from mytable
      3   where mycolumn is null
      4  /
    
      MYCOLUMN
    ----------
    
    
    1 row selected.
    
    SQL> select * from table(dbms_xplan.display_cursor(null,null,'allstats last'))
      2  /
    
    PLAN_TABLE_OUTPUT
    -----------------------------------------------------------------------------------------
    SQL_ID  daxdqjwaww1gr, child number 0
    -------------------------------------
    select /*+ gather_plan_statistics */ *   from mytable  where mycolumn
    is null
    
    Plan hash value: 1816312439
    
    -----------------------------------------------------------------------------------
    | Id  | Operation        | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
    -----------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT |      |      1 |        |      1 |00:00:00.01 |       2 |
    |*  1 |  INDEX RANGE SCAN| I1   |      1 |      1 |      1 |00:00:00.01 |       2 |
    -----------------------------------------------------------------------------------
    
    Predicate Information (identified by operation id):
    ---------------------------------------------------
    
       1 - access("MYCOLUMN" IS NULL)
    
    
    19 rows selected.
    

    As you can see, the index is being used: the plan shows an INDEX RANGE SCAN on I1 and no full table scan.
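
    For contrast, here is a minimal sketch of the same test against a plain single-column index. The index name i2 is made up for illustration. A row whose only indexed column is NULL gets no entry in a B-tree index, so I would expect the plan to fall back to a full table scan here, but please check it on your own version rather than take my word for it:

    ```sql
    -- Sketch only: i2 is a hypothetical index name, not part of the original test.
    -- Replace the two-column index with a plain single-column one.
    drop index i1;
    create index i2 on mytable(mycolumn);

    -- Refresh statistics so the optimizer sees the new index (SQL*Plus command).
    exec dbms_stats.gather_table_stats(user,'mytable',cascade=>true)

    -- Look at the plan for the same IS NULL predicate.
    explain plan for
    select *
      from mytable
     where mycolumn is null;

    select * from table(dbms_xplan.display);
    ```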

    Regards, Rob.

  • 2020-12-02 14:07

    Nullable columns can have a big impact on performance in "NOT IN" queries. Because rows in which every indexed column is NULL are not stored in a B-tree index, Oracle cannot rely on the index alone to rule out NULLs and has to do a full table scan to check for them, even when an index exists.

    For example:

    create table t1 as select rownum rn from all_objects;
    
    create table t2 as select rownum rn from all_objects;
    
    create unique index t1_idx on t1(rn);
    
    create unique index t2_idx on t2(rn);
    
    delete from t2 where rn = 3;
    
    explain plan for
    select *
      from t1
     where rn not in ( select rn
                         from t2 );
    
    ---------------------------------------------------------------------------
    | Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
    ---------------------------------------------------------------------------
    |   0 | SELECT STATEMENT   |      | 50173 |   636K|  3162   (1)| 00:00:38 |
    |*  1 |  FILTER            |      |       |       |            |          |
    |   2 |   TABLE ACCESS FULL| T1   | 50205 |   637K|    24   (5)| 00:00:01 |
    |*  3 |   TABLE ACCESS FULL| T2   | 45404 |   576K|     2   (0)| 00:00:01 |
    ---------------------------------------------------------------------------
    

    Because rn is nullable, the query has to check for NULL values, so it has to do a full table scan of t2 for each row in t1.
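
    The reason the NULL check matters is SQL's three-valued logic: a single NULL in the NOT IN list makes the whole condition UNKNOWN, so no row can pass. A small sketch against dual (no tables needed) that shows the effect:

    ```sql
    -- 1 <> 2 is TRUE, but 1 <> NULL is UNKNOWN,
    -- so the NOT IN as a whole is UNKNOWN and the row is filtered out.
    select 'row returned' as result
      from dual
     where 1 not in (2, null);
    -- no rows selected

    -- Without the NULL in the list, the condition is TRUE.
    select 'row returned' as result
      from dual
     where 1 not in (2, 3);
    -- one row: 'row returned'
    ```

    So as long as t2.rn might contain a NULL, the optimizer cannot treat NOT IN as a plain anti-join on the index.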

    Now, if we make the columns NOT NULL, the optimizer can use the indexes.

    alter table t1 modify rn not null;
    
    alter table t2 modify rn not null;
    
    explain plan for
    select *
      from t1
     where rn not in ( select rn
                         from t2 );
    
    -----------------------------------------------------------------------------
    | Id  | Operation          | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
    -----------------------------------------------------------------------------
    |   0 | SELECT STATEMENT   |        |  2412 | 62712 |    24   (9)| 00:00:01 |
    |   1 |  NESTED LOOPS ANTI |        |  2412 | 62712 |    24   (9)| 00:00:01 |
    |   2 |   INDEX FULL SCAN  | T1_IDX | 50205 |   637K|    21   (0)| 00:00:01 |
    |*  3 |   INDEX UNIQUE SCAN| T2_IDX | 45498 |   577K|     1   (0)| 00:00:01 |
    -----------------------------------------------------------------------------
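
    If altering the tables is not an option, one possible sketch is to rule out NULLs in the query itself, which gives the optimizer the same guarantee for this statement. This is an assumption about what the optimizer will do with it, not something shown in the plans above, so compare the plan yourself. Also note it is not the same query when NULLs really are present: the original NOT IN returns no rows at all if t2.rn contains a NULL.

    ```sql
    -- Sketch only: explicit IS NOT NULL predicates instead of NOT NULL constraints.
    explain plan for
    select *
      from t1
     where rn is not null
       and rn not in ( select rn
                         from t2
                        where rn is not null );

    select * from table(dbms_xplan.display);
    ```

    As far as I know, Oracle 11g and later also have a null-aware anti-join that can avoid the FILTER plan without any rewrite, so the result may depend on your version.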
    