Question
I have a table MYTABLE with a date column SDATE, which is the primary key of the table and has a unique index on it.
When I run this query:
SELECT MIN(SDATE) FROM MYTABLE
it returns the answer instantly. The same happens for:
SELECT MAX(SDATE) FROM MYTABLE
But, if I query both together:
SELECT MIN(SDATE), MAX(SDATE) FROM MYTABLE
it takes much longer to execute. I analyzed the plans and found that when only one of MIN or MAX is queried, it uses INDEX FULL SCAN (MIN/MAX), but when both are queried at the same time, it does a FULL TABLE SCAN.
Why?
Test Data:
version 11g
create table MYTABLE
(
  SDATE DATE not null,
  CELL  VARCHAR2(10),
  data  NUMBER
)
tablespace CHIPS
  pctfree 10
  pctused 40
  initrans 1
  maxtrans 255
  storage
  (
    initial 64K
    minextents 1
    maxextents unlimited
  );
alter table MYTABLE
  add constraint PK_SDATE primary key (SDATE)
  using index
  tablespace SYSTEM
  pctfree 10
  initrans 2
  maxtrans 255
  storage
  (
    initial 64K
    minextents 1
    maxextents unlimited
  );
Load table:
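begin
  -- the loop index i is declared implicitly by the FOR loop
  for i in 0 .. 100000 loop
    -- one row per hour, going back from the current time
    insert into MYTABLE(sdate, cell, data)
    values(sysdate - i/24, 'T' || i, i);
  end loop;
  commit;  -- commit once after the load instead of once per row
end;
/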
Gather stats:
begin
dbms_stats.gather_table_stats(tabname => 'MYTABLE', ownname => 'SYS');
end;
Plan1:
Plan2:
Answer 1:
The INDEX FULL SCAN (MIN/MAX) operation can only visit one end of the index. When you run
SELECT MIN(SDATE), MAX(SDATE) FROM MYTABLE
you are asking it to visit both ends. Therefore, if you want both the minimum and the maximum column value, an INDEX FULL SCAN (MIN/MAX) is not viable.
A more detailed analysis can be found here.
Answer 2:
The explain plans are different: a single MIN or MAX will produce an INDEX FULL SCAN (MIN/MAX), whereas when the two are present you will get an INDEX FULL SCAN or an INDEX FAST FULL SCAN.
To understand the difference, we have to look for a description of a FULL INDEX SCAN:
In a full index scan, the database reads the entire index in order.
In other words, if the index is on a VARCHAR2 field, Oracle will fetch the first leaf block of the index, which would contain, for example, all entries that start with the letter "A", and will then read block by block through all entries alphabetically until the last entry ("A" to "Z"). Oracle can proceed this way because the entries in a B-tree index are sorted.
When you see INDEX FULL SCAN (MIN/MAX) in an explain plan, that is the result of an optimization that exploits the fact that, since the entries are sorted, you can stop after reading the first one if you are only interested in the MIN. If you are interested in the MAX only, Oracle can use the same access path, but this time starting at the last entry and reading backwards from "Z" to "A".
As of now, a FULL INDEX SCAN has only one direction (either forward or backward) and cannot start from both ends simultaneously. This is why you get a less efficient access method when you ask for both the min and the max.
As suggested by other answers, if the query is performance-critical, you could do this optimization yourself by fetching the min and the max in two distinct queries.
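A minimal sketch of the two-query approach, assuming the MYTABLE definition from the question and a client such as SQL*Plus. The variable names l_min and l_max are illustrative only. Each statement contains a single aggregate, so each one is a candidate for INDEX FULL SCAN (MIN/MAX), although the plan actually chosen depends on your version and statistics:

```sql
set serveroutput on

declare
  l_min date;  -- hypothetical names, not from the original post
  l_max date;
begin
  -- one aggregate per statement, so each can read a single end of the index
  select min(sdate) into l_min from mytable;
  select max(sdate) into l_max from mytable;

  dbms_output.put_line('min: ' || to_char(l_min, 'YYYY-MM-DD HH24:MI:SS'));
  dbms_output.put_line('max: ' || to_char(l_max, 'YYYY-MM-DD HH24:MI:SS'));
end;
/
```

Note that the two statements run at slightly different points in time, so on a table that is being modified concurrently the pair is not guaranteed to come from one consistent snapshot unless the transaction isolation is set accordingly.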
Answer 3:
Try not to select both ends of the index in one query. Restructuring the query like this:
select max_date, min_date
from (select max(sdate) max_date from mytable),
(select min(sdate) min_date from mytable)
will cause the optimizer to access the index with INDEX FULL SCAN (MIN/MAX) inside nested loops (in our case, twice).
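An equivalent formulation, offered as a sketch rather than a guaranteed fix, puts each aggregate in its own scalar subquery against DUAL. It assumes the MYTABLE table from the question. Wrapping it in EXPLAIN PLAN lets you check which access path the optimizer actually picks on your system:

```sql
-- Each scalar subquery contains a single MIN or MAX, so each one is
-- eligible for INDEX FULL SCAN (MIN/MAX) on the primary key index.
explain plan for
select (select min(sdate) from mytable) as min_date,
       (select max(sdate) from mytable) as max_date
from dual;

-- Display the plan that was just explained.
select * from table(dbms_xplan.display);
```

If the plan still shows a full scan, the statistics on the table and index are worth checking before rewriting the query further.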
Answer 4:
I have to say that I do not see the same behaviour in 11.2.
I set up a test case as follows (updated from 10k to 1m rows in response to Vincent's comment):
set linesize 130
set pagesize 0
create table mytable ( sdate date );
Table created.
insert into mytable
select sysdate - level
from dual
connect by level <= 1000000;
1000000 rows created.
commit;
Commit complete.
alter table mytable add constraint pk_mytable primary key ( sdate ) using index;
Table altered.
begin
  dbms_stats.gather_table_stats( user, 'MYTABLE'
                               , estimate_percent => 100
                               , cascade => true
                               );
end;
/
PL/SQL procedure successfully completed.
Then, executing your queries, I get almost identical-looking explain plans (notice the different types of INDEX FULL SCAN):
explain plan for select min(sdate) from mytable;
Explained.
select * from table(dbms_xplan.display);
Plan hash value: 3877058912
-----------------------------------------------------------------------------------------
| Id  | Operation                  | Name       | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT           |            |     1 |     8 |     1   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE            |            |     1 |     8 |            |          |
|   2 |   INDEX FULL SCAN (MIN/MAX)| PK_MYTABLE |     1 |     8 |     1   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------
9 rows selected.
explain plan for select min(sdate), max(sdate) from mytable;
Explained.
select * from table(dbms_xplan.display);
Plan hash value: 3812733167
-------------------------------------------------------------------------------
| Id  | Operation        | Name       | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT |            |     1 |     8 |   252   (0)| 00:00:04 |
|   1 |  SORT AGGREGATE  |            |     1 |     8 |            |          |
|   2 |   INDEX FULL SCAN| PK_MYTABLE |  1000K|  7812K|   252   (0)| 00:00:04 |
-------------------------------------------------------------------------------
9 rows selected.
To quote from a previous answer of mine:
The two most common reasons for a query not using indexes are:
- It's quicker to do a full table scan.
- Poor statistics.
Unless there's something you're not posting in the question, my immediate answer would be that you have not collected statistics on this table, you haven't collected them with a high enough estimate percent, or you've used ANALYZE, which will not help the cost-based optimizer, unlike dbms_stats.gather_table_stats.
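One way to check whether statistics exist and how recent they are is to query the data dictionary. This is a sketch that assumes you are connected as the owner of MYTABLE; if the table lives in another schema (the question uses SYS), the ALL_ or DBA_ views with an OWNER filter would be needed instead:

```sql
-- Table-level statistics: LAST_ANALYZED is NULL if none were ever gathered.
select table_name, num_rows, last_analyzed
from user_tables
where table_name = 'MYTABLE';

-- Index-level statistics for every index on the table.
select index_name, num_rows, leaf_blocks, last_analyzed
from user_indexes
where table_name = 'MYTABLE';
```

A NUM_ROWS value far from the real row count, or an old LAST_ANALYZED date, would suggest regathering with dbms_stats.gather_table_stats and cascade => true.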
To quote from the documentation on ANALYZE:
For the collection of most statistics, use the DBMS_STATS package, which lets you collect statistics in parallel, collect global statistics for partitioned objects, and fine tune your statistics collection in other ways. See Oracle Database PL/SQL Packages and Types Reference for more information on the DBMS_STATS package.
Use the ANALYZE statement (rather than DBMS_STATS) for statistics collection not related to the cost-based optimizer:
Source: https://stackoverflow.com/questions/12565790/selecting-both-min-and-max-from-the-table-is-slower-than-expected