We recently migrated a customer system to Oracle 12c and to the latest version of our product. This process includes running a number of migration scripts which mostly add or ch
In Oracle release 11g Oracle has introdused a new optimization technique for improving the performance of DDL operations. This new feature allows extremely rapid execution time when adding a NOT NULL column with default value to an existing table. Since release 12c the DDL optimization has been extended to include NULL columns having default value.
Consider following test table with 1.000.000 rows:
sql> create table xxy
as select rownum a from dual connect by level <= 1e6
;
sql> select /*+ gather_plan_statistics */ count(1) from xxy;
sql> select * from table(dbms_xplan.display_cursor);
Now we're going to add an extra not null column having a default value in different sessions for 11g and 12c:
11g> alter table xxy add b number default 1;
--Table XXY altered. Elapsed: 00:01:00.998
12c> alter table xxy add b number default 1;
--Table XXY altered. Elapsed: 00:00:00.052
Notice the difference in the execution time: 1M rows updated in 5 ms !?
Execution plan shows:
11g> select count(1) from xxy where b = 1;
COUNT(1)
----------
1000000
11g> select * from table(dbms_xplan.display_cursor);
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | 1040 (100)| |
| 1 | SORT AGGREGATE | | 1 | 13 | | |
|* 2 | TABLE ACCESS FULL| XXY | 898K| 11M| 1040 (1)| 00:00:13 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("B"=1)
Note
-----
- dynamic sampling used for this statement (level=2)
12c> select count(1) from xxy where b = 1;
12c> select * from table(dbms_xplan.display_cursor);
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | 429 (100)| |
| 1 | SORT AGGREGATE | | 1 | 5 | | |
|* 2 | TABLE ACCESS FULL| XXY | 1000K| 4882K| 429 (2)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter(DECODE(TO_CHAR(SYS_OP_VECBIT("SYS_NC00002$",0)),NULL,NVL("
B",1),'0',NVL("B",1),'1',"B")=1)
Note
-----
- statistics feedback used for this statement
The execution plan on 12c shows by contrast with 11g a complex predicate part involving a new internal column SYS_NC00006$
.
This predicate indicates that, internally, Oracle is still considering the B column to be potentially able to contain non default values. It means - Oracle at first do not physically updates each row with the default value.
Why a new internal column SYS_NC00006$
is created?
12c> select column_name, virtual_column, hidden_column, user_generated
from user_tab_cols
where table_name = 'XXY'
;
COLUMN_NAME VIR HID USE
---------------- --- --- ---
B NO NO YES
SYS_NC00002$ NO YES NO
A NO NO YES
12c> select a, b, SYS_NC00002$ hid from xxy where a in (1,10);
A B HID
---------- ---------- ----------------
1 1
10 1
12c> update xxy set b=1 where a=10 and b=1;
1 row updated.
12c> select a, b, SYS_NC00002$ hid from xxy where a in (1,10);
A B HID
---------- ---------- ----------------
1 1
10 1 01
Notice the difference in the values of B and related internal columns. Oracle is simply checking through its system generated internal column (e.g. SYS_NC00006$
) and via the SYS_OP_VECBIT
function whether to consider the default value of the B column or the real value modiefed via an explicit DML statement.
What is with two separate alter statements?
12c> alter table xxy add (b integer);
12c> alter table xxy modify b default 1;
12c> select count(b), count(coalesce(b,0)) nulls from xxy where b = 1 or b is null;
COUNT(B) NULLS
---------- ----------
0 1000000
The value of new column remains NULL for all rows. No real updates are needed therefore the DDL statement will be not optimized.
Here is an OTN article that explains the new DDL optimization in more detail.