Compressing your data-kinfinger-ChinaUnix博客

kinfingerasage.blog.chinaunix.net

首页　| 　博文目录　| 　关于我

kinfinger

博客访问： 1804315
博文数量： 335
博客积分： 4690
博客等级：上校
技术积分： 4341
用户组：普通用户
注册时间： 2010-05-08 21:38

个人简介

无聊之人--除了技术，还是技术，你懂得

文章分类

全部博文（335）

ORACEL（4）
REDIS（1）
REDIS（0）
LINUX/UNIX（4）
PHP（1）
COGNOS（1）
COBOL（4）
CICS（3）
EXCEL（2）
MYSQL（7）
DB2（72）
TWS（0）
SA（0）
mainframe（16）
web（10）

javascript（1）
APUE（43）
REXX（7）
work（13）
life（13）
python（95）
c/c++（26）
asage（7）
未分配的博文（6）

文章存档

2016年（29）

2015年（18）

2014年（7）

2013年（86）

2012年（90）

2011年（105）

我的朋友

相关博文

Compressing your data

分类： DB2/Informix

2012-11-26 15:16:00

advantage of compress data

You can reduce the space required for a table by using data compression.

Compressing the data in a table space can significantly reduce the amount of disk

space that is needed to store data and can help improve buffer pool performance.

With compressed data, you might see some of the following performance benefits,

depending on the SQL work load and the amount of compression:

v Higher buffer pool hit ratios

v Fewer I/Os

v Fewer getpage operations

basic concept of compress dictionary

When you compress data, bit strings that occur frequently are replaced by shorter

strings. Information about the mapping of bit strings to their replacements is

stored in a compression dictionary. Computer processing is required to compress

data before it is stored and to decompress the data when it is retrieved from

storage. In many cases, using the COMPRESS clause can significantly reduce the

amount of disk space needed to store data, but the compression ratio that you

achieve depends on the characteristics of your data.

step to copreess your data

To compress data:

1. Specify COMPRESS YES in one of the following SQL statements:

v CREATE TABLESPACE

v ALTER TABLESPACE

2. Populate the table space with data by taking one of the following actions:

v Run the LOAD utility with REPLACE, RESUME NO, or RESUME YES (if the

tablespace contains no rows) and without KEEPDICTIONARY.

v Run the REORG utility without KEEPDICTIONARY.

If no compression dictionary already exists, and the amount of data in the

tables space reaches a threshold determined by DB2, a compression dictionary

is created. After the compression dictionary is built, DB2 uses it to compress all

subsequent data added to the table space.

If no compression dictionary already exists, and the amount of data in the

tables space reaches a threshold determined by DB2, a compression dictionary

is created. After the compression dictionary is built, DB2 uses it to compress all

subsequent data added to the table space.

TWO WAYS OF POPULATE COMPRESS DICTIONARY

If you use the REORG utility to build the compression dictionary, DB2 uses

a sampling technique to build the dictionary. This technique uses the first n

rows from the table space and then continues to sample rows for the

remainder of the UNLOAD phase. The value of n is determined by how

much your data can be compressed. In most cases, this sampling technique

produces a better dictionary and might produce better results for table

spaces that contain tables with dissimilar kinds of data.

Otherwise, DB2 uses only the first n rows added to the table space to build

the contents of the dictionary.(LOAD)

AS IS SEEN FROM ABOVE,THE REORG METHOD IS BETTER THAN LOAD

Logging effects

If a data row is compressed, all data that is logged because of SQL changes

to that data is compressed. Thus, you can expect less logging for insertions

and deletions; the amount of logging for updates varies. Applications that

are sensitive to log-related resources can experience some benefit with

compressed data.

Distributed data

DB2 decompresses data before transmitting it to VTAM.

Deciding whether to compress data

Consider these factors before compressing data:

Data row size

DB2 compresses the data of one record at a time. (The prefix of the record

is not compressed.) As row lengths become shorter, compression yields

diminishing returns because 8 bytes of overhead are required to store each

record in a data page. On the other hand, when row lengths are very long,

compression of the data portion of the row might yield little or no

reduction in data set size because DB2 rows cannot span data pages. In the

case of very long rows, using a larger page size can enhance the benefits of

compression, especially if the data is accessed primarily in a sequential

mode.

If compressing the record produces a result that is no shorter than the

original, DB2 does not compress the record.

Table space size

Compression can work very well for large table spaces. With small table

spaces, the size of the compression dictionary (64 KB) can offset the space

savings that compression provides.

Processing costs

Decompressing a row of data costs significantly less than compressing that

same row. The access path that DB2 chooses impacts the processor cost for

data compression. In general, the relative overhead of compression is

higher for table space scans and is less costlier for index access.

I/O costs

When rows are accessed sequentially, fewer I/Os might be required to

access data that is stored in a compressed table space. However, the

reduced I/O resource consumption is traded for extra processor cost for

decoding the data.

Data patterns

The frequency of patterns in the data determines the compression savings.

Data with many repeated strings (such as state and city names or numbers

with sequences of zeros) results in good compression savings.

Calculating the space that is required for a dictionary

A dictionary contains the information that is used for compressing and

decompressing the data in a table space or partition. The dictionary resides in that

same table space or partition.

To find the effectiveness of data compression:

v Use the DSN1COMP stand-alone utility to find out how much space can be

saved and how much processing the compression of your data requires. Run

DSN1COMP on a data set that contains a table space, a table space partition, or

an image copy. DSN1COMP generates a report of compression statistics but does

not compress the data.

v Examine the compression reports after you use REORG or LOAD to build the

compression dictionary and compress the data. Both utilities issue a report

message (DSNU234I or DSNU244I). The report message gives information about

how well the data is compressed and how much space is saved. (REORG with

the KEEPDICTIONARY option does not produce the report.)

v Query catalog tables to find information about data compression

– PAGESAVE column of the SYSIBM.SYSTABLEPART tells you the percentage

of pages that are saved by compressing the data.

– PCTROWCOMP columns of SYSIBM.SYSTABLES and SYSIBM.SYSTABSTATS

tells you the percentage of the rows that were compressed in the table or

partition the last time RUNSTATS was run. Use the RUNSTATS utility to

update these catalog columns.

SYNTAX OF CREATE TABLE

COMPRESS

Specifies whether data compression applies to the rows of the table space or

partition. Do not specify COMPRESS for a LOB table space or a table space in

a work file database.

For partitioned table spaces, the COMPRESS attribute for each partition is the

value from the first of the following conditions that apply:

v The value specified in the COMPRESS clause in the PARTITION clause for

the partition

v The value specified in the COMPRESS clause that is not in any PARTITION

clause

v An implicit COMPRESS NO by default.

See DB2 Performance Monitoring and Tuning Guide for more information about

data compression.

YES

Specifies data compression. The rows are not compressed until the LOAD

or REORG utility is run on the table in the table space or partition. If

COMPRESS YES is specified, the table space is created with basic row

format.

需要注意的，表空间的属性虽然为compress属性，但是表空间不一定是压缩，只有在reorg

或是load以后才会产生压缩字典，后续插入的行才会进行压缩，先前插入的行是不压缩的。

NO Specifies no data compression for the table space or partition.

阅读(963) | 评论(0) | 转发(0) |

上一篇：mysql replication---master/salve IN ONE PC

下一篇：What does SELECT do in the background

给主人留下些什么吧！~~

感谢所有关心和支持过ChinaUnix的朋友们

16024965号-6