Chinaunix首页 | 论坛 | 博客
  • 博客访问: 1789036
  • 博文数量: 335
  • 博客积分: 4690
  • 博客等级: 上校
  • 技术积分: 4341
  • 用 户 组: 普通用户
  • 注册时间: 2010-05-08 21:38
个人简介

无聊之人--除了技术,还是技术,你懂得

文章分类

全部博文(335)

文章存档

2016年(29)

2015年(18)

2014年(7)

2013年(86)

2012年(90)

2011年(105)

分类: DB2/Informix

2012-11-26 15:16:00

advantage of compress data
You can reduce the space required for a table by using data compression.
Compressing the data in a table space can significantly reduce the amount of disk
space that is needed to store data and can help improve buffer pool performance.
With compressed data, you might see some of the following performance benefits,
depending on the SQL work load and the amount of compression:
v Higher buffer pool hit ratios
v Fewer I/Os
v Fewer getpage operations
basic concept of  compress dictionary
When you compress data, bit strings that occur frequently are replaced by shorter
strings. Information about the mapping of bit strings to their replacements is
stored in a compression dictionary. Computer processing is required to compress
data before it is stored and to decompress the data when it is retrieved from
storage. In many cases, using the COMPRESS clause can significantly reduce the
amount of disk space needed to store data, but the compression ratio that you
achieve depends on the characteristics of your data.
step to copreess your data
To compress data:
1. Specify COMPRESS YES in one of the following SQL statements:
v CREATE TABLESPACE
v ALTER TABLESPACE
2. Populate the table space with data by taking one of the following actions:
v Run the LOAD utility with REPLACE, RESUME NO, or RESUME YES (if the
tablespace contains no rows) and without KEEPDICTIONARY.
v Run the REORG utility without KEEPDICTIONARY.
If no compression dictionary already exists, and the amount of data in the
tables space reaches a threshold determined by DB2, a compression dictionary
is created. After the compression dictionary is built, DB2 uses it to compress all
subsequent data added to the table space.
If no compression dictionary already exists, and the amount of data in the
tables space reaches a threshold determined by DB2, a compression dictionary
is created. After the compression dictionary is built, DB2 uses it to compress all
subsequent data added to the table space.
TWO WAYS OF POPULATE COMPRESS DICTIONARY 
If you use the REORG utility to build the compression dictionary, DB2 uses
a sampling technique to build the dictionary. This technique uses the first n
rows from the table space and then continues to sample rows for the
remainder of the UNLOAD phase. The value of n is determined by how
much your data can be compressed. In most cases, this sampling technique
produces a better dictionary and might produce better results for table
spaces that contain tables with dissimilar kinds of data.
Otherwise, DB2 uses only the first n rows added to the table space to build
the contents of the dictionary.(LOAD)
AS IS SEEN FROM ABOVE,THE REORG METHOD IS BETTER THAN LOAD 
Logging effects
If a data row is compressed, all data that is logged because of SQL changes
to that data is compressed. Thus, you can expect less logging for insertions
and deletions; the amount of logging for updates varies. Applications that
are sensitive to log-related resources can experience some benefit with
compressed data.
Distributed data
DB2 decompresses data before transmitting it to VTAM.
Deciding whether to compress data
Consider these factors before compressing data:
Data row size
DB2 compresses the data of one record at a time. (The prefix of the record
is not compressed.) As row lengths become shorter, compression yields
diminishing returns because 8 bytes of overhead are required to store each
record in a data page. On the other hand, when row lengths are very long,
compression of the data portion of the row might yield little or no
reduction in data set size because DB2 rows cannot span data pages. In the
case of very long rows, using a larger page size can enhance the benefits of
compression, especially if the data is accessed primarily in a sequential
mode.
If compressing the record produces a result that is no shorter than the
original, DB2 does not compress the record.
Table space size
Compression can work very well for large table spaces. With small table
spaces, the size of the compression dictionary (64 KB) can offset the space
savings that compression provides.
Processing costs
Decompressing a row of data costs significantly less than compressing that
same row. The access path that DB2 chooses impacts the processor cost for
data compression. In general, the relative overhead of compression is
higher for table space scans and is less costlier for index access.
I/O costs
When rows are accessed sequentially, fewer I/Os might be required to
access data that is stored in a compressed table space. However, the
reduced I/O resource consumption is traded for extra processor cost for
decoding the data.
Data patterns
The frequency of patterns in the data determines the compression savings.
Data with many repeated strings (such as state and city names or numbers
with sequences of zeros) results in good compression savings.
Calculating the space that is required for a dictionary
A dictionary contains the information that is used for compressing and
decompressing the data in a table space or partition. The dictionary resides in that
same table space or partition.
To find the effectiveness of data compression:
v Use the DSN1COMP stand-alone utility to find out how much space can be
saved and how much processing the compression of your data requires. Run
DSN1COMP on a data set that contains a table space, a table space partition, or
an image copy. DSN1COMP generates a report of compression statistics but does
not compress the data.
v Examine the compression reports after you use REORG or LOAD to build the
compression dictionary and compress the data. Both utilities issue a report
message (DSNU234I or DSNU244I). The report message gives information about
how well the data is compressed and how much space is saved. (REORG with
the KEEPDICTIONARY option does not produce the report.)
v Query catalog tables to find information about data compression
– PAGESAVE column of the SYSIBM.SYSTABLEPART tells you the percentage
of pages that are saved by compressing the data.
– PCTROWCOMP columns of SYSIBM.SYSTABLES and SYSIBM.SYSTABSTATS
tells you the percentage of the rows that were compressed in the table or
partition the last time RUNSTATS was run. Use the RUNSTATS utility to
update these catalog columns.

SYNTAX OF CREATE TABLE   

COMPRESS
Specifies whether data compression applies to the rows of the table space or
partition. Do not specify COMPRESS for a LOB table space or a table space in
a work file database.
For partitioned table spaces, the COMPRESS attribute for each partition is the
value from the first of the following conditions that apply:
v The value specified in the COMPRESS clause in the PARTITION clause for
the partition
v The value specified in the COMPRESS clause that is not in any PARTITION
clause
v An implicit COMPRESS NO by default.
See DB2 Performance Monitoring and Tuning Guide for more information about
data compression.
YES
Specifies data compression. The rows are not compressed until the LOAD
or REORG utility is run on the table in the table space or partition. If
COMPRESS YES is specified, the table space is created with basic row
format.
需要注意的,表空间的属性虽然为compress属性,但是表空间不一定是压缩,只有在reorg
或是load以后才会产生压缩字典,后续插入的行才会进行压缩,先前插入的行是不压缩的。
NO Specifies no data compression for the table space or partition.


阅读(948) | 评论(0) | 转发(0) |
给主人留下些什么吧!~~