当where子句后指定了一个表的多个列条件时,优化器通常会将多个列的选择性(selectivity)相乘得到where语句的选择性,导致优化器做出错误判断!
Oracle 11g引入了多列统计信息概念,如果上面情况列关联性很好,我们可以做多列统计信息收集,让优化器做出正确判断。
例如:
SQL> select count(*) from sh.customers where CUST_STATE_PROVINCE = 'CA';
COUNT(*)
----------
3341
SQL> select count(*) from sh.customers where CUST_STATE_PROVINCE = 'CA' and country_id=52790;
COUNT(*)
----------
3341
SQL> select count(*) from sh.customers where CUST_STATE_PROVINCE = 'CA' and country_id=52775;
COUNT(*)
----------
0
可以看出表customers的列CUST_STATE_PROVINCE、country_id是有关系的,只有多列统计信息才能让优化器知道他们的关系,从而有一个更精确的选择性(selectiveity)。
1、创建Column Group:
DECLARE
cg_name varchar2(30);
BEGIN
cg_name := dbms_stats.create_extended_stats(null,'customers',
'(cust_state_province,country_id)');
END;
2、查看column group name:
select sys.dbms_stats.show_extended_stats_name('sh','customers',
'(cust_state_province,country_id)') col_group_name
from dual;
COL_GROUP_NAME
--------------------------------------------------------------------------------
SYS_STU#S#WF25Z#QAHIHE#MOFFMM_
3、删除Column Group
exec dbms_stats.drop_extended_stats('sh','customers','(cust_state_province,country_id)');
4、监控Column Groups
--查询多列统计信息
Select extension_name, extension
from user_stat_extensions
where table_name='CUSTOMERS';
EXTENSION_NAME EXTENSION
------------------------------ ----------------------------------------
SYS_STU#S#WF25Z#QAHIHE#MOFFMM_ ("CUST_STATE_PROVINCE","COUNTRY_ID")
--查看distinct数和柱状图使用情况
select e.extension col_group, t.num_distinct, t.histogram
from user_stat_extensions e, user_tab_col_statistics t
where e.extension_name = t.column_name
and e.table_name = t.table_name
and t.table_name = 'CUSTOMERS';
COL_GROUP NUM_DISTINCT HISTOGRAM
---------------------------------------- ------------ ---------------
("CUST_STATE_PROVINCE","COUNTRY_ID") 145 FREQUENCY
5、收集多列统计信息
1)收集已存在的列组统计信息
EXEC DBMS_STATS.GATHER_TABLE_STATS('SH','CUSTOMERS',METHOD_OPT =>'FOR ALL COLUMNS SIZE AUTO');
2)收集新指定的列组统计信息
EXEC DBMS_STATS.GATHER_TABLE_STATS('SH','CUSTOMERS',METHOD_OPT =>'FOR ALL COLUMNS SIZE SKEWONLY FOR COLUMNS (CUST_STATE_PROVINCE,COUNTRY_ID) SIZE SKEWONLY');
实验:
1)当不使用多列统计信息时,真实结果是3341,执行计划是1107.
SQL> exec dbms_stats.drop_extended_stats('sh','customers','(cust_state_province,country_id)');
SQL> select count(*) from sh.customers where CUST_STATE_PROVINCE = 'CA' and country_id=52790;
COUNT(*)
----------
3341
Execution Plan
----------------------------------------------------------
Plan hash value: 296924608
--------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 16 | 406 (1)| 00:00:05 |
| 1 | SORT AGGREGATE | | 1 | 16 | | |
|* 2 | TABLE ACCESS FULL| CUSTOMERS | 1107 | 17712 | 406 (1)| 00:00:05 |
--------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("CUST_STATE_PROVINCE"='CA' AND "COUNTRY_ID"=52790)
Statistics
----------------------------------------------------------
120 recursive calls
0 db block gets
1588 consistent gets
1454 physical reads
0 redo size
423 bytes sent via SQL*Net to client
420 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
7 sorts (memory)
0 sorts (disk)
1 rows processed
2)当使用多列统计信息时,真实结果是3341,执行计划是3294.
SQL> EXEC DBMS_STATS.GATHER_TABLE_STATS('SH','CUSTOMERS',METHOD_OPT =>'FOR ALL COLUMNS SIZE SKEWONLY FOR COLUMNS (CUST_STATE_PROVINCE,COUNTRY_ID) SIZE SKEWONLY');
SQL> select count(*) from sh.customers where CUST_STATE_PROVINCE = 'CA' and country_id=52790;
COUNT(*)
----------
3341
Execution Plan
----------------------------------------------------------
Plan hash value: 296924608
--------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 16 | 406 (1)| 00:00:05 |
| 1 | SORT AGGREGATE | | 1 | 16 | | |
|* 2 | TABLE ACCESS FULL| CUSTOMERS | 3294 | 52704 | 406 (1)| 00:00:05 |
--------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("CUST_STATE_PROVINCE"='CA' AND "COUNTRY_ID"=52790)
Statistics
----------------------------------------------------------
8 recursive calls
0 db block gets
1458 consistent gets
1454 physical reads
0 redo size
423 bytes sent via SQL*Net to client
420 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
3)即以上情况,使用多列统计信息能让优化器得到更准确的判断!
来源:CSDN
作者:执笔画情ora
链接:https://blog.csdn.net/oradbm/article/details/103882116