I'm using an indexed column as a filter by putting it 'between' two literal values. (The column is in the second position of the index and actually makes execution slower.)
Bugs in the optimizer or the parser cause some date formats to downgrade static partition pruning to dynamic partition pruning. Partition pruning changes lead to different cardinality and costs, which then lead to significant changes to many other parts of the plan.
This answer only partially explains the problem and contains some speculation. Hopefully it will at least shed a little light on what the problem is and is not. It is at least a good starting point if you really need a complete explanation and want to submit a service request to Oracle.
Terminology and some background reading
Static partition pruning is when the optimizer determines at compile time which partition will be used. Statistics are per-partition, leading to
better cardinality estimates, leading to better plans. For example, think of a table partitioned by status where the partition for CANCELLED
is tiny and the partition for ACTIVE is large. Knowing which partition is used can completely change the join order and access methods of the optimal plan.
Pstart and Pstop will be numeric values when static partition pruning is used.
Dynamic partition pruning is when the optimizer cannot determine the partition until runtime. Data is only retrieved from the required partitions but the
execution plan is built without special knowledge of which partition is used. Some partition statistic estimates will be a simple average of all available partitions.
In the above example of a table partitioned by status, the average of a tiny partition and a large partition does not accurately represent either. Either Pstart or Pstop will include the word KEY when dynamic partition pruning is used.
The Oracle® Database VLDB and Partitioning Guide includes a section about Datatype Conversions that is worth reading. For example, one relevant quote from the manual:
Only a properly applied TO_DATE function guarantees that the database is capable of uniquely determining the date value and using it potentially for static pruning, which is especially beneficial for single partition access.
Sample schema and data
This simple test case demonstrates the problem. It also rules out common performance issues, such as missing statistics.
First, create a sample table with 2 partitions, one large and one small.
create table gprs_history_import(id number, start_call_date_time date)
partition by range (start_call_date_time)
(
partition p_large values less than (date '2014-06-01'),
partition p_small values less than (date '2014-07-01')
);
insert into gprs_history_import
select level, date '2014-05-01'
from dual connect by level <= 1000;
insert into gprs_history_import
select level, date '2014-06-01'
from dual connect by level <= 10;
begin
dbms_stats.gather_table_stats(user, 'GPRS_HISTORY_IMPORT');
end;
/
select count(*) from gprs_history_import partition (p_large); -- 1000
select count(*) from gprs_history_import partition (p_small); -- 10
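To confirm that partition-level statistics were actually gathered, the data dictionary can be queried directly. This is a quick sanity check; USER_TAB_PARTITIONS is a standard Oracle view, and the expected counts below are simply the row counts from the inserts above.

```sql
-- Per-partition row counts as recorded by the statistics.
-- NUM_ROWS is what the optimizer uses for static pruning estimates.
-- Expect roughly P_LARGE = 1000 and P_SMALL = 10 after the inserts above.
select partition_name, num_rows, last_analyzed
from   user_tab_partitions
where  table_name = 'GPRS_HISTORY_IMPORT'
order  by partition_position;
```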
Static to dynamic causes bad cardinality estimates
The static cardinality estimate is a perfect 1000. The extra space in the second date format changes Pstop from 1 to KEY. The plan changes from static to dynamic partition pruning, and the dynamic estimate is an inaccurate 505, the average of 1000 and 10.
For simplicity this example only shows a bad cardinality estimate. It is not necessary to show a query running slowly, since bad row estimates inevitably lead to bad execution plans for many reasons.
explain plan for select /* static partition pruning */ *
from gprs_history_import
where start_call_date_time < to_date('20140601 000000','yyyymmdd hh24miss');
select * from table(dbms_xplan.display);
Plan hash value: 452971246
--------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
--------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1000 | 12000 | 16 (0)| 00:00:01 | | |
| 1 | PARTITION RANGE SINGLE| | 1000 | 12000 | 16 (0)| 00:00:01 | 1 | 1 |
| 2 | TABLE ACCESS FULL | GPRS_HISTORY_IMPORT | 1000 | 12000 | 16 (0)| 00:00:01 | 1 | 1 |
--------------------------------------------------------------------------------------------------------------
explain plan for select /* dynamic partition pruning */ *
from gprs_history_import
where start_call_date_time < to_date('20140601 ','yyyymmdd ');
select * from table(dbms_xplan.display);
Plan hash value: 2464174375
----------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
----------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 505 | 6060 | 29 (0)| 00:00:01 | | |
| 1 | PARTITION RANGE ITERATOR| | 505 | 6060 | 29 (0)| 00:00:01 | 1 | KEY |
|* 2 | TABLE ACCESS FULL | GPRS_HISTORY_IMPORT | 505 | 6060 | 29 (0)| 00:00:01 | 1 | KEY |
----------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("START_CALL_DATE_TIME"<TO_DATE('20140601 ','yyyymmdd '))
Date format parsing problems
Now for some speculation as to why the query moves from static to dynamic partition pruning.
It is not always obvious when the optimizer can use static or dynamic partition pruning. In general, literals allow static pruning and variables require dynamic pruning.
--#1: Obviously static: It uses an unambiguous ANSI date literal.
select * from gprs_history_import where start_call_date_time = date '2000-11-01';
--#2: Obviously dynamic: It uses a bind variable.
select * from gprs_history_import where start_call_date_time = :date;
--#3: Probably dynamic: The optimizer cannot always infer the literal value.
select * from gprs_history_import where start_call_date_time =
(select date '2000-11-01' from dual);
--#4: Probably static: FEB is not always valid, but Oracle can figure that out.
select * from gprs_history_import where start_call_date_time =
to_date('01-FEB-2000', 'DD-MON-YYYY');
When you consider all the performance and internationalization issues around case #4, it becomes clear how difficult it is to parse dates. The value of to_date('01-FEB-2000', 'DD-MON-YYYY') depends on several NLS parameters, such as NLS_DATE_LANGUAGE. The query is valid for English but not for German. And if NLS_CALENDAR is not set to GREGORIAN, then even an all-numeric date format can be wrong. The to_date string is not a bind value, but it is not clearly a literal either.
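The NLS dependence can be seen directly at the session level. The sketch below uses TO_CHAR to show the language-dependent month abbreviations; the exact abbreviations depend on the locale data shipped with your Oracle version.

```sql
-- Same format string, different NLS_DATE_LANGUAGE, different result.
alter session set nls_date_language = 'AMERICAN';
select to_char(date '2000-12-01', 'DD-MON-YYYY') from dual;  -- 01-DEC-2000

alter session set nls_date_language = 'GERMAN';
select to_char(date '2000-12-01', 'DD-MON-YYYY') from dual;  -- 01-DEZ-2000

-- Parsing the English abbreviation under the German settings fails:
-- select to_date('01-DEC-2000', 'DD-MON-YYYY') from dual;
-- ORA-01843: not a valid month
```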
The difference between true date literals and formatted strings is more apparent if hard parses are counted. Query #1 will not force a hard parse even if the language is changed, but query #4 will. This can be demonstrated by running several variations of each, changing the language, and then running select value from v$sesstat natural join v$statname where name = 'parse count (hard)' and sid = userenv('SID').
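The steps above can be sketched as a session script. The v$ views and the 'parse count (hard)' statistic are standard, but the session needs SELECT privileges on them, and the counts you see will depend on what is already in the shared pool.

```sql
-- Baseline hard parse count for this session.
select value from v$sesstat natural join v$statname
where  name = 'parse count (hard)' and sid = userenv('SID');

-- Query #1 (ANSI date literal): repeat it after changing the language;
-- the hard parse count should not increase on the repeat.
select * from gprs_history_import where start_call_date_time = date '2000-11-01';

alter session set nls_date_language = 'GERMAN';

-- Query #4 (formatted string): per the counts described above, this one
-- forces a new hard parse after the NLS change.
select * from gprs_history_import where start_call_date_time =
to_date('01-FEB-2000', 'DD-MON-YYYY');

-- Re-check the counter to compare.
select value from v$sesstat natural join v$statname
where  name = 'parse count (hard)' and sid = userenv('SID');
```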
Oracle must have a variable somewhere to denote "this is not a bind variable but could lead to different plans based on NLS settings". That variable does not always lead to dynamic partition pruning but there must be some bugs somewhere that occasionally break it.
OK, I'll give it a try. This is mostly deduction from the available information:
Why does Oracle choose a different execution plan?
It seems that in your second query, with the unusual date format, the optimizer has no idea what the value of the resulting date is. You can see this in the filter predicate:
1 - filter(TO_DATE('20140610 ','yyyymmdd ')<=TO_DATE(' 2014-06-10 23:59:59', 'syyyy-mm-dd hh24:mi:ss'))
This means the optimizer is not even sure that the first date is smaller than the second! So the optimizer has no idea how many rows will be returned and just uses a generic plan without taking specific statistics into account. It would be the same if you had a user-defined function xyt() that returned a date for the range: the optimizer has no way to know what date value will result. You get a general all-purpose plan, which should be reasonably decent for any date range specified.
In the first and third cases, the optimizer understands the date directly and can estimate the number of rows in the date range by using statistics. Where the second query looked to the optimizer like BETWEEN X AND 3, this query looks like BETWEEN 1 AND 3, so it optimizes the plan for the predicted number of returned rows.
The strange thing is that the query optimizer has such problems with an unusual date format; this could be filed as a bug or a request for improvement.
But an important point: if you return a high number of rows from a table, a full table scan without index access will in many cases be much faster, especially when operating on certain partitions. The table scan will only access the partition for the matching date range, so only the dates in question, and it returns all rows from that partition. This is much faster than querying the index for each single row and then fetching each row by index access. Try to profile the queries; the full table scan on the partition should be about three times as fast, with much less I/O.
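One way to profile the two approaches is a SQL*Plus autotrace session like the sketch below. The FULL and INDEX hints are real Oracle hints, but the index name here is hypothetical; substitute the name of your own index on the date column. The "three times as fast" figure above is this answer's estimate, not a guarantee.

```sql
set autotrace traceonly statistics
set timing on

-- Full scan confined to the matching partition by partition pruning.
select /*+ full(g) */ *
from   gprs_history_import g
where  start_call_date_time < date '2014-06-01';

-- Force index access for comparison (GPRS_HISTORY_IMPORT_IX is a
-- placeholder; use your actual index name).
select /*+ index(g gprs_history_import_ix) */ *
from   gprs_history_import g
where  start_call_date_time < date '2014-06-01';

set autotrace off
```

Compare the "consistent gets" and "physical reads" statistics between the two runs rather than relying on elapsed time alone.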