众所周知,Table是分区的时候,analyze根据所有partition上的已有的统计信息“计算”出整个表级别上的统计信息;而dbms_stats是实际去计算整个表范围的统计信息,因此表级别的统计信息比analyze更精确,反映表上真实的情况.
Oracle公司也因此建议使用.
Do not use the COMPUTE and ESTIMATE clauses of ANALYZE to collect optimizer statistics. These clauses are supported for backward compatibility. Instead, use the DBMS_STATS package, which lets you collect statistics in parallel, collect global statistics for partitioned objects, and fine tune your statistics collection in other ways. The optimizer, which depends upon statistics, will eventually use only statistics that have been collected by DBMS_STATS.
对于分区表,更精确的统计信息是否意味着更高的分析成本哪,cpu time,logical reads?在收集表级别的统计信息时候,会不会再对分区表进行全表扫描;又或者有什么高超算法累计出表的统计信息而不需要全部分区的重复扫描?
通过测试,在分析分区表的统计信息的时候,dbms_stats明显比analyze的成本更高,需要再次重复读取所有的分区段。可以从10046 trace event中看到对应SQL语句。
partitions 4 as select * from dba_objects;
Table created.
SQL> begin
2 for i in 1..10 loop
3 insert /*+ append */ into test select * from test;
4 commit;
5 end loop;
6 end;
7 /
begin
*
ERROR at line 1:
ORA-01688: unable to extend table BINZHANG.TEST partition SYS_P5 by 128 in
tablespace USERS
ORA-06512: at line 3
SQL> conn "BINZHANG"/"BINZHANG"
Connected.
SQL> select * from v$mystat where STATISTIC#=9;
SID STATISTIC# VALUE
---------- ---------- ----------
320 9 23
SQL> analyze table test compute statistics;
Table analyzed.
SQL> select * from v$mystat where STATISTIC#=9;
SID STATISTIC# VALUE
---------- ---------- ----------
320 9 8486
SQL> analyze table test delete statistics;
Table analyzed.
SQL> conn "BINZHANG"/"BINZHANG"
Connected.
SQL> select * from v$mystat where STATISTIC#=9;
SID STATISTIC# VALUE
---------- ---------- ----------
320 9 21
SQL> exec dbms_stats.gather_table_stats(user,'test',cascade=>true);
PL/SQL procedure successfully completed.
SQL> select * from v$mystat where STATISTIC#=9;
SID STATISTIC# VALUE
---------- ---------- ----------
320 9 40084
可以看到,dbms_stats需要更多的逻辑读,更多的资源和时间。当从analyze转到dbms_stats的时候,就有可能因为分析花费更长的时间而出现ora-1555
10046 sql trace显示出SQL需要对所有的分区进行扫描来计算表级别的统计信息。
注意KEY是1 - 4
select /*+ no_parallel(t) no_parallel_index(t) dbms_stats
cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
*/ count(*),count("SUBOBJECT_NAME"),count(distinct "SUBOBJECT_NAME"),
sum(sys_op_opnsize("SUBOBJECT_NAME")),substrb(dump(min("SUBOBJECT_NAME"),16,
0,32),1,120),substrb(dump(max("SUBOBJECT_NAME"),16,0,32),1,120),
count("OBJECT_ID"),count(distinct "OBJECT_ID"),count("DATA_OBJECT_ID"),
count(distinct "DATA_OBJECT_ID")
from
"BINZHANG"."TEST" sample ( 9.7638523499) t
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.01 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 2 0.74 0.87 5666 7755 0 1
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 4 0.75 0.87 5666 7755 0 1
Misses in library cache during parse: 1
Optimizer mode: RULE
Parsing user id: 26 (recursive depth: 1)
Rows Row Source Operation
------- ---------------------------------------------------
1 SORT GROUP BY (cr=7755 pr=5666 pw=0 time=870025 us)
60727 PARTITION HASH ALL PARTITION: 1 4 (cr=7755 pr=5666 pw=0 time=485977 us)
60727 TABLE ACCESS SAMPLE TEST PARTITION: 1 4 (cr=7755 pr=5666 pw=0 time=486239 us)