1. 什么是统计信息
统计信息主要是描述中
表,索引的大小,规模,数据分布状况等的一类信息。比如,表的行数,块数,平均每行的大小,索引的leaf
blocks,索引字段的行数,不同值的大小等,都属于统计信息。CBO正是根据这些统计信息数据,计算出不同访问路径下,不同join
方式下,各种计划的成本,最后选择出成本最小的计划。
统计信息是存放在数据字段表中的,如tab$。一般我们从数据字段视图中察看统计信息状况,如DBA_TABLES,DBA_INDEXES,DBA_TAB_COL_STATISTICS, DBA_TAB_HISTOGRAMS 等。
列
举下DBA_TABLES,DBA_INDEXES
视图中表示统计信息的一些字段。这些字段只有搜集过统计信息之后才有值,否则是空的。这些字段中last_analyzed
字段表示上次统计信息搜集的时间,大家可以根据这个字段,快速的了解最近一次统计信息搜集的时间。
--dba_tables
NUM_ROWS* NUMBER Number of rows in the table
BLOCKS* NUMBER Number of used data blocks in the table
EMPTY_BLOCKS* NUMBER Number of empty (never used) data blocks in the table
AVG_SPACE* NUMBER Average amount of free space, in bytes, in a data block allocated to the table
CHAIN_CNT*
NUMBER Number of rows in the table that are chained from one data block
to another, or which have migrated to a new block, requiring a link to
preserve the old ROWID
AVG_ROW_LEN* NUMBER Average length of a row in the table in bytes
LAST_ANALYZED DATE Date on which this table was most recently analyzed
--dba_indexes
BLEVEL*
NUMBER B*-Tree level: depth of the index from its root block to its
leaf blocks. A depth of 0 indicates that the root block and leaf block
are the same.
LEAF_BLOCKS* NUMBER Number of leaf blocks in the index
DISTINCT_KEYS*
NUMBER Number of distinct indexed values. For indexes that enforce
UNIQUE and PRIMARY KEY constraints, this value is the same as the
number of rows in the table (USER_TABLES.NUM_ROWS)
AVG_LEAF_BLOCKS_PER_KEY*
NUMBER Average number of leaf blocks in which each distinct value in
the index appears, rounded to the nearest integer. For indexes that
enforce UNIQUE and PRIMARY KEY constraints, this value is always 1.
AVG_DATA_BLOCKS_PER_KEY*
NUMBER Average number of data blocks in the table that are pointed to
by a distinct value in the index rounded to the nearest integer. This
statistic is the average number of data blocks that contain rows that
contain a given value for the indexed columns.
CLUSTERING_FACTOR* NUMBER Indicates the amount of order of the rows in the table based on the values of the index.
•
If the value is near the number of blocks, then the table is very well
ordered. In this case, the index entries in a single leaf block tend to
point to rows in the same data blocks.
• If the value is near the
number of rows, then the table is very randomly ordered. In this case,
it is unlikely that index entries in the same leaf block point to rows
in the same data blocks
NUM_ROWS NUMBER Number of rows in the index
SAMPLE_SIZE NUMBER Size of the sample used to analyze the index
LAST_ANALYZED DATE Date on which this index was most recently analyzed
2. 如何搜集统计信息
统计信息搜集也是有多种方法,推荐大家使用DBMS_STATS 表来进行统计信息搜集及进行一般的统计信息维护。
DBMS-STATS
包,主要提供了搜集,删除,导出,导入,修改统计信息的方法,分别对应于gather系列,delete系列,export
系列,import系列,set系列的子过程。一般可能主要是使用统计信息的搜集,以及导出导入这样的功能。具体来说,主要会使用到如下几个子过程:
GATHER_INDEX_STATS Procedure
Gathers index statistics.
GATHER_TABLE_STATS Procedure
Gathers table and column (and index) statistics.
CREATE_STAT_TABLE Procedure
Creates a table with name stattab in ownname's schema which is capable of holding statistics.
EXPORT_TABLE_STATS Procedure
Retrieves statistics for a particular table and stores them in the user stat table.
EXPORT_SCHEMA_STATS Procedure
Retrieves
statistics for all objects in the schema identified by ownname and
stores them in the user stat table identified by stattab.
IMPORT_INDEX_STATS Procedure
Retrieves statistics for a particular index from the user stat table identified by stattab and stores them in the dictionary.
IMPORT_TABLE_STATS Procedure
Retrieves statistics for a particular table from the user stat table identified by stattab and stores them in the dictionary.
IMPORT_SCHEMA_STATS Procedure
Retrieves
statistics for all objects in the schema identified by ownname from the
user stat table and stores them in the dictionary.
对于统计信息的搜集,谈谈个人的几点理解:
统计信息默认是存放在数据字典表中的,也只有数据字典中的统计信息,才会影响到CBO。
DBMS_STATS 提供的CREATE_STAT_TABLE 过程,只是生成一个用户自定义的特定格式的表,用来存放统计信息罢了,这个表中的统计信息是不会影响到统计信息的。
GATHER 系列过程中,如果指定stattab,statid,statown 参数(也可以不指定),则是搜集的统计信息除了更新到数据字典外,还在statown 用户下的stattab 表中存放一份,标示为 statid;
EXPORT和IMPORT 系列的过程中,stattab,statid,statown
参数不能为空,分别表示把数据字典中的当前统计信息导出到用户自定义的表中,以及把用户表中的统计信息导入到数据字典中,很明显可以看出,这里的导入操作
和上面GATHER 操作会改变统计信息,可能会引起执行执行计划的改变,因此要慎重操作。
每次统计信息搜集前,将旧的统计信息备份起来是很有必要的;特别是保留一份或多份系统在稳定时期的统计信息也是很有必要的。
多长时间搜集一次统计信息,对于统计信息如何备份和保留,搜集统计信息时如何选择合适的采样,并行,直方图设置等都比较重要,需要设计一个较好的统计信息搜集策略。
阅读(888) | 评论(1) | 转发(0) |