DB2中 常见的可能会用到DFSORT的utility有:
The following DB2 Utilities use DFSORT exclusively1 to perform their SORT and MERGE functions:
- LOAD
- REORG TABLESPACE and REORG INDEX
- REBUILD INDEX
- RUNSTATS
- CHECK DATA, CHECK INDEX, and CHECK LOB
2 DB2同DFSORT之间的关系:
The most important parameters DB2 Utilities passes to DFSORT are related to storage usage and the
number of work data sets. If appropriate, you can use the DB2 SORTNUM parameter to change the
number of work data sets for each sort (e.g., SORTNUM=8 specifies 8 work data sets per sort). But
you cannot use DFSORT's DFSPARM data set to change the MAINSIZE value (storage) or
DYNALLOC n value (number of work data sets) used by DB2 Utilities at run-time because DB2
Utilities will not be aware of the changes in these values.
3 each sort invocation :
You need at least two sort work data sets for each sort. The SORTNUM value applies to each sort
invocation in the utility. For example, if there are three indexes, SORTKEYS is specified, there are no
constraints limiting parallelism, and SORTNUM is specified as 8, then a total of 24 sort work data sets
will be allocated for the job.
4 关于作业并发的情况:
Each sort work data set consumes both above the line and below the line virtual storage, so if you
specify too high a value for SORTNUM, the utility may decrease the degree of parallelism due
to virtual storage constraints, and possibly decreasing the degree down to one, meaning no
parallelism. To determine the size of the sort datasets, search for "Calculating the size of the sort work
data sets" under each utility that performs sorts in the "DB2 Utility Guide and Reference."
5 DFSORT所使用的device type:
Note that you can also use the DB2 SORTDEVT parameter to change the device type of the work data
sets (e.g. SORTDEVT=WORK specifies WORK as the device type).
数据量同实存的关系:
Data size Estimated main memory use per DFSORT task
1 GB 10 MB
10 GB 30 MB
100 GB 70 MB
where data size is the number of bytes to be sorted, e.g., record count * record length.
Increasing the region size of the batch Utility job to accommodate the main memory needed for the
parallel DFSORT tasks can allow the sorts to succeed.
DFSORT所使用的work data set:
Temporary datasets
SORTWKnn: Temporary datasets for sort input and output when sorting keys if Parallel Index
Build (PIB) is not used. nn identifies one or more data sets that are to be used by the sort task for
LOAD, REBUILD, and REORG. The SORTWK01 data set is also used when collecting distribution
statistics by RUNSTATS. The used message data set is UTPRINT.
SWnnWKmm: Temporary datasets for sort input and output when sorting keys if Parallel Index
Build (PIB) is used. nn identifies the subtask pair (one per index), and mm identifies one or more
data sets that are to be used by that subtask pair when executing LOAD, REBUILD, or REORG.
The used message data set is UTPRINnn.
DATAWKnn: Temporary data sets for sort input and output used by REORG for the data sort. nn
identifies one or more data sets that are to be used by the sort task. The used message data set is
UTPRINT.
DannWKmm: Temporary datasets used by REORG for unload parallelism. nn identifies the
subtask pair (one per partition), and mm identifies one or more data sets that are to be used by
that subtask pair. The used message data sets are DTPRINnn.
STATWK01: Temporary datasets for sort input and output when collecting distribution statistics for
columns groups (RUNSTATS and inline statistics). The used message data set is RNPRIN01.
ST01WKnn: Temporary datasets for sort input and output when collecting frequency statistics on
DPSIs or when TABLESPACE TABLE COLGROUP FREQVAL is specified (RUNSTATS and
inline statistics). The used message data set is STPRIN01.
ST02WKnn: Temporary datasets for sort input and output when collecting frequency statistics for
column groups (RUNSTATS and inline statistics).
Informational output datasets
? UTPRINT: A dataset that contains messages from DFSORT. Required for the LOAD, REBUILD,
and REORG utilities.
? UTPRINnn: Data sets that contain messages from DFSORT from utility subtasks pairs. These
data sets are dynamically allocated when UTPRINT is allocated to SYSOUT.
? DTPRINnn: Datasets that contain messages from UNLOAD parallelism functions. These data sets
are dynamically allocated when UTPRINT is allocated to SYSOUT.
? STPRIN01: A dataset that contains messages from DFSORT. This data set is required when
frequency statistics are collected on DPSIs or when TABLESPACE TABLE COLGROUP
FREQVAL is specified.
RNPRIN01: A dataset that contains messages from DFSORT. This data set is required when
distribution statistics are collected for column groups.
Other datasets:
SORTDEVT: Gives DB2 the device type of where to allocate the sortwork data sets. The
specification of this keyword allows DB2 to perform dynamic data set allocation, unless the user
overrides it with their own DFSORT DYNALLOC parameter. For REORG unload/reload partition
parallelism, DB2 requires the specification of SORTDEVT so that the optimal degree of parallelism
can be determined.
sort 并发的查询:
The number of parallel DFSORT tasks for Utilities may be limited by allocating the UTPRINxx data
sets to something other than SYSOUT *. E.g., if UTPRIN01, UTPRIN02, and UTPRIN03 are
allocated to real data sets then at most 3 parallel sort tasks would be initiated by the Utility.sort的必知Sort work data sets cannot span volumes.的内容:
Smaller volumes require more sort work data sets to sort
the same amount of data; therefore, large volume sizes can reduce the number of needed sort work
data sets. Since z/OS V1.7, it is possible to define volumes with more than 64,000 tracks. IBM
recommends that at least 1.2 times the amount of data to be sorted be provided in sort work data sets
on disk. Using two or three large SORTWKnn data sets are preferable to several small ones.
DFSORT runs most efficiently with a small number of data sets. A small number of data sets also
benefits the maximum degree of parallelism for utilities, so you can likely improve the elapsed time, if
you have large sort volumes to reduce the number of data sets for very large sorts.
SORTNUM tells DB2 how many sort work data sets to use for each sort invocation. This is only
passed on to DFSORT in the DYNALLOC option statement and tells DFSORT into how many data
sets the sort work space should be divided. DB2 allocated sort work data sets do not use this number;
they are always allocated as large as possible to result in the smallest possible number of data sets.
ref:
阅读(2334) | 评论(0) | 转发(0) |