Chinaunix首页 | 论坛 | 博客
  • 博客访问: 225359
  • 博文数量: 119
  • 博客积分: 10
  • 博客等级: 民兵
  • 技术积分: 1261
  • 用 户 组: 普通用户
  • 注册时间: 2012-05-23 17:50
文章分类

全部博文(119)

文章存档

2016年(8)

2015年(78)

2014年(33)

我的朋友

分类: 数据库开发技术

2015-02-11 16:50:23

netezza command manual: nz_genstats


当表没有记录的时候,Statistics Status  也会显示Unavailable.
[nz@hlixau1105 log]$ nz_genstats -full  db_name test_t1

Database:  db_name
# Tables: 1

    #  Table Name                                   Table Rowcount  Current Operation                  Seconds
=====  ========================================  =================  =================================  =======
    1  TEST_T1                                                   1  GENERATE /*full*/  STATISTICS ...  1

[nz@hlixau1105 log]$ nz_get  db_name test_t1
___________________________________________________________________________________________________________________________________________

Database:  db_name  (220758)


   Table: TEST_T1  (10715790)
Rowcount: 0
Dist Key: ID

attnum Column Name              Statistics Status   Minimum Value   Maximum Value    # of Unique Values       # of NULLs      MaxLen AvgLen
------ ------------------------ ==================  =============== =============== ==================== ==================== ====== ======
    1  ID                       Unavailable                                                                                 
    2  ID2                      Unavailable                                                                                 

                                                                               
GENERATE STATISTICS ON

; 等于 GENERATE EXPRESS STATISTICS ON
;  都是 express 的方式收集 ,
GENERATE /*full*/  STATISTICS   类似 HINT
GENERATE /*basic*/ STATISTICS

nz_genstats -full 可以做到full 收集                
[nz@hlixau1105 log]$ nz_genstats -full  db_name test_t1

Database:  db_name
# Tables: 1

    #  Table Name                                   Table Rowcount  Current Operation                  Seconds
=====  ========================================  =================  =================================  =======
    1  TEST_T1                                                   2  GENERATE /*full*/  STATISTICS ...  1

[nz@hlixau1105 log]$ nz_get  db_name test_t1
___________________________________________________________________________________________________________________________________________

Database:  db_name  (220758)


   Table: TEST_T1  (10715790)
Rowcount: 2
Dist Key: ID

attnum Column Name              Statistics Status   Minimum Value   Maximum Value    # of Unique Values       # of NULLs      MaxLen AvgLen
------ ------------------------ ==================  =============== =============== ==================== ==================== ====== ======
    1  ID                       Full                1               2                                  2                    
    2  ID2                      Full                1               3                                  2                    

 


Full statistics take the longest to generate -- but provide more accurate dispersion information (the # of unique values) for each column.

Express statistics use a faster (but less accurate) mathematical formula to determine the dispersion.

Basic statistics can be generated the fastest -- because they skip the dispersion calculation entirely.  (This is new as of NPS Version 4.6).

### -min ... -max 指定表记录在 between -min  and  -max 之间的表才会收集
 [-min ]     Optional  switches     (default is -min 0)
 [-max ]                            (default is -max 99999999999999999)

These switches allow you to control which tables are to have a GENSTATS run against them ... based on their table rowcount being between the -min
and -max values specifed.  The default is for all tables to be included.
 Example: Specifying  '-max 1000000'  will result in only tables with <= 1M rows having a GENSTATS performed against them (and then only if
 necessary ... if the statistics aren't already up-to-date).
   
   
***  note :
1)  Full statistics are recommended for smaller dimension tables ... which usually have fewer rows ... where the accuracy of the dispersion value
    tends to have more of an impact on query planning.

2)  For larger fact tables, express statistics or basic statistics are typically used because of the time savings in generating the statistics. 
 
3)  When GENSTATS are run against a table, the zonemaps for that table are rebuilt.

4)  INSERT/UPDATE/DELETE queries against the table are blocked while GENSTATs is being run.  Conversely, GENSTATS will block and not start
 if there is any outstanding INS/UPD/DEL operation currently being performaned against the table.
 
###万一Netezza的优化器判断错误了呢 ? 
If statistics are already up-to-date for a given table, this script won't bother to re-generate the statistics 
 
 
 
 

####脚本中主要通过如下sql 来决定采用何种方式收集统计信息
* If not specified, then this script will adhere to whatever type of stats each individual table currently has associated with it.
如果没有指定 -full|-express|-basic ,那么NETEZZA 会坚持以这张表最近的收集类型来决定

select
        -- We don't maintain statistics for 'big text columns'.
        -- So there shouldn't even be a statistics row for such a column
        -- Thus, no need to check for nulls
        tablename ,columnname,RECENT,
        CASE
            WHEN RECENT::smallint =    85  THEN 'FULL'        -- Full
            WHEN RECENT::smallint =     1  THEN 'FULL'        -- Full Min/Max OK
            WHEN RECENT::smallint =     0  THEN 'FULL'        -- Full Outdated

            WHEN RECENT::smallint =   341  THEN 'EXPRESS'     -- Express
            WHEN RECENT::smallint =   257  THEN 'EXPRESS'     -- Express Min/Max OK
            WHEN RECENT::smallint =   256  THEN 'EXPRESS'     -- Express Outdated

            WHEN RECENT::smallint =   149  THEN 'BASIC'       -- Basic
            WHEN RECENT::smallint =   129  THEN 'BASIC'       -- Basic Min/Max OK
            WHEN RECENT::smallint =   128  THEN 'BASIC'       -- Basic Outdated

            WHEN RECENT::smallint =   170  THEN 'UNAVAILABLE'
            WHEN RECENT::smallint = 16554  THEN 'UNAVAILABLE'

            WHEN RECENT::smallint =   169  THEN 'MINMAXONLY'

            ELSE                                'UNKNOWN'
        END as stats_type,
        CASE
            WHEN RECENT::smallint in (85,341,149) THEN  'UP TO DATE'
            ELSE                                        'OUTDATED'
        END as stats_state
        from _v_statistic
        where   upper(objtype) in ('TABLE','SECURE TABLE')
  and tablename = 'TEST_T1';


到目前为止还没有发现Netezza如何查看表的上一次收集时间 , 如果想查看,可以从  query history and   /nz/kit/log/postgres/pg.log 入手 .  

 

阅读(2519) | 评论(0) | 转发(0) |
给主人留下些什么吧!~~