oracle统计信息-11新特性多列统计信息

好久不见. 提交于 2020-01-28 03:54:12

 当where子句后指定了一个表的多个列条件时,优化器通常会将多个列的选择性(selectivity)相乘得到where语句的选择性,导致优化器做出错误判断!

Oracle 11g引入了多列统计信息概念,如果上面情况列关联性很好,我们可以做多列统计信息收集,让优化器做出正确判断。

例如:

SQL> select count(*) from sh.customers where CUST_STATE_PROVINCE = 'CA';

COUNT(*)

----------

      3341

SQL> select count(*) from sh.customers where CUST_STATE_PROVINCE = 'CA' and country_id=52790;

 COUNT(*)

----------

      3341

SQL> select count(*) from sh.customers where CUST_STATE_PROVINCE = 'CA' and country_id=52775;

 COUNT(*)

----------

          0

可以看出表customers的列CUST_STATE_PROVINCE、country_id是有关系的,只有多列统计信息才能让优化器知道他们的关系,从而有一个更精确的选择性(selectiveity)。

1、创建Column Group

DECLARE

  cg_name varchar2(30);

BEGIN

  cg_name := dbms_stats.create_extended_stats(null,'customers', 

             '(cust_state_province,country_id)');

END;

2、查看column group name:

select sys.dbms_stats.show_extended_stats_name('sh','customers',

       '(cust_state_province,country_id)') col_group_name

from dual;

COL_GROUP_NAME

--------------------------------------------------------------------------------

SYS_STU#S#WF25Z#QAHIHE#MOFFMM_

3、删除Column Group

exec dbms_stats.drop_extended_stats('sh','customers','(cust_state_province,country_id)');

4、监控Column Groups

 

--查询多列统计信息

Select extension_name, extension

from user_stat_extensions

where table_name='CUSTOMERS';

EXTENSION_NAME                EXTENSION

------------------------------ ----------------------------------------

SYS_STU#S#WF25Z#QAHIHE#MOFFMM_   ("CUST_STATE_PROVINCE","COUNTRY_ID")

--查看distinct数和柱状图使用情况

select e.extension col_group, t.num_distinct, t.histogram

  from user_stat_extensions e, user_tab_col_statistics t

 where e.extension_name = t.column_name

   and e.table_name = t.table_name

   and t.table_name = 'CUSTOMERS';

COL_GROUP                              NUM_DISTINCT  HISTOGRAM

---------------------------------------- ------------ ---------------

("CUST_STATE_PROVINCE","COUNTRY_ID")           145  FREQUENCY

5、收集多列统计信息

1)收集已存在的列组统计信息

EXEC DBMS_STATS.GATHER_TABLE_STATS('SH','CUSTOMERS',METHOD_OPT =>'FOR ALL COLUMNS SIZE AUTO');

2)收集新指定的列组统计信息

EXEC DBMS_STATS.GATHER_TABLE_STATS('SH','CUSTOMERS',METHOD_OPT =>'FOR ALL COLUMNS SIZE SKEWONLY FOR COLUMNS (CUST_STATE_PROVINCE,COUNTRY_ID) SIZE SKEWONLY');

实验:

1)当不使用多列统计信息时,真实结果是3341,执行计划是1107.

SQL> exec dbms_stats.drop_extended_stats('sh','customers','(cust_state_province,country_id)');

SQL> select count(*) from sh.customers where CUST_STATE_PROVINCE = 'CA' and country_id=52790;

COUNT(*)

----------

      3341

 

Execution Plan

----------------------------------------------------------

Plan hash value: 296924608

--------------------------------------------------------------------------------

| Id  | Operation      | Name      | Rows  | Bytes | Cost (%CPU)| Time     |

--------------------------------------------------------------------------------

|   0 | SELECT STATEMENT   |        |     1 |    16 |   406      (1)| 00:00:05 |

|   1 |  SORT AGGREGATE    |               |     1 |    16 |      |              |

|*  2 |   TABLE ACCESS FULL| CUSTOMERS |  1107 | 17712 |   406 (1)| 00:00:05 |

--------------------------------------------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

2 - filter("CUST_STATE_PROVINCE"='CA' AND "COUNTRY_ID"=52790)

Statistics

----------------------------------------------------------

         120  recursive calls

           0  db block gets

       1588  consistent gets

       1454  physical reads

           0  redo size

         423  bytes sent via SQL*Net to client

         420  bytes received via SQL*Net from client

           2  SQL*Net roundtrips to/from client

           7  sorts (memory)

           0  sorts (disk)

           1  rows processed

2)当使用多列统计信息时,真实结果是3341,执行计划是3294.

SQL> EXEC DBMS_STATS.GATHER_TABLE_STATS('SH','CUSTOMERS',METHOD_OPT =>'FOR ALL COLUMNS SIZE SKEWONLY FOR COLUMNS (CUST_STATE_PROVINCE,COUNTRY_ID) SIZE SKEWONLY');

SQL> select count(*) from sh.customers where CUST_STATE_PROVINCE = 'CA' and country_id=52790;

COUNT(*)

----------

      3341

Execution Plan

----------------------------------------------------------

Plan hash value: 296924608

--------------------------------------------------------------------------------

| Id  | Operation      | Name      | Rows  | Bytes | Cost (%CPU)| Time     |

--------------------------------------------------------------------------------

|   0 | SELECT STATEMENT   |        |     1 |    16 |   406      (1)| 00:00:05 |

|   1 |  SORT AGGREGATE    |               |     1 |    16 |      |              |

|*  2 |   TABLE ACCESS FULL| CUSTOMERS |  3294 | 52704 |   406        (1)| 00:00:05 |

--------------------------------------------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

2 - filter("CUST_STATE_PROVINCE"='CA' AND "COUNTRY_ID"=52790)

Statistics

----------------------------------------------------------

           8  recursive calls

           0  db block gets

       1458  consistent gets

       1454  physical reads

           0  redo size

         423  bytes sent via SQL*Net to client

         420  bytes received via SQL*Net from client

           2  SQL*Net roundtrips to/from client

           0  sorts (memory)

           0  sorts (disk)

           1  rows processed

3)即以上情况,使用多列统计信息能让优化器得到更准确的判断!

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!