需要知道的是Prefetch 一种异步I/O技术,有三种方法:
dynamic,list,sequential
同步I/O有两种,sequential /random syn
DB2会根据访问page量的不同,来采取何种异步I/O技术。
Prefetch is a mechanism for reading a set of pages, usually 128KB, into the buffer
pool with only one asynchronous I/O operation.
Prefetch can allow substantial savings in both CPU and I/O by avoiding costly
synchronous read I/O operations. To achieve those savings, you can monitor the
use of prefetch, and take steps to enable your queries to take advantage of optimal
prefetch access paths.
DB2 uses a process called sequential detection to determine whether to use
prefetch and the type of prefetch to use. DB2 uses different types of prefetch in
different situations, including dynamic prefetch, sequential prefetch, and list
prefetch.
Dynamic prefetch
With dynamic prefetch, DB2 uses a sequential detection algorithm to determine
whether data pages are being read sequentially.
Dynamic prefetch offers improved performance over sequential prefetch,
especially when DB2 cannot detect whether the pages can be accessed sequentially
(because the catalog statistics might not always be correct). When DB2 chooses
dynamic prefetch, the value of the PREFETCH column is normally set to 'D.'
However, DB2 might also use dynamic prefetch when the value of PREFETCH
indicates 'S.'
When dynamic prefetch is used
DB2 uses dynamic prefetch to read data pages whenever an index is used to
determine which data pages contain the required rows.
DB2 also uses dynamic prefetch to avoid synchronous I/O for index pages when it
scans the leaf pages an index in key-sequential order. An organized index is likely
to trigger dynamic prefetch I/Os. However, when the leaf pages of an index are
not well organized, dynamic prefetch is not triggered, and some synchronous I/O
might occur. However, when DB2 detects the synchronous I/OS it can switch to
list prefetch for the next set of leaf pages. In that case, synchronous I/O occurs
only for the non-leaf pages.
Sequential prefetch
Sequential prefetch is used for table scans. The maximum number of pages read by a
request issued from your application program is determined by the size of the
buffer pool used.
PSPI For certain utilities (LOAD, REORG, RECOVER), the prefetch quantity can
be twice as much.
When sequential prefetch is used
Sequential prefetch is used only for table space scans. When DB2 expects that
sequential prefetch will probably be used, the value of the PREFETCH column is
'S.' However, DB2 might use dynamic prefetch in some cases if when the value of
PREFETCH column is 'S.'
List prefetch
List prefetch reads a set of data pages determined by a list of row IDs taken from an
index.
PSPI The data pages need not be contiguous. The maximum number of pages
that can be retrieved in a single list prefetch is 32 (64 for utilities). The value of the
PREFETCH column is set to 'L' when DB2 expects to use list prefetch.
List prefetch can be used with either single or multiple index access.
When the optimizer chooses the list prefetch access path, DB2 uses the following
process:
1. Retrieve the list of rows identifiers through a matching index scan on one or
more index.
2. Sort the list of row identifiers in ascending order by page number.
3. Prefetch the pages in order, by using the sorted list of row identifiers.
List prefetch does not preserve the data ordering given by the index. Because the
row IDs are sorted in page number order before accessing the data, the data is not
retrieved in order by any column. If the data must be ordered for an ORDER BY
clause or any other reason, it requires an additional sort.
When list prefetch is used
List prefetch is used:
v Typically with a single index that has a cluster ratio lower than 80%
v Sometimes on indexes with a high cluster ratio, if the estimated amount of data
to be accessed is too small to make sequential prefetch efficient, but large
enough to require more than one regular read
v Always to access data by multiple index access
v Always to access data from the inner table during a hybrid join
v Typically for updatable cursors when the index contains columns that might be
updated.
Advantages and disadvantages of list prefetch
List prefetch is most useful for skip-sequential access when a number of
non-sequential data pages are accessed in sequential order, but with intervening
pages that do not contain needed data. In such cases, dynamic prefetch reads all of
the intervening pages, the number of asynchronous pages read exceeds the number
of get page operations, and the buffer pool is not used efficiently. List prefetch
offers the following specific advantages over dynamic prefetch:
v List prefetch uses buffers very economically.
v List prefetch is not sensitive to index cluster ratio and performs much better
than dynamic prefetch when the data getpages are sparse.
v List prefetch never uses two getpages for the same data page.
v If several data pages need to be skipped, list prefetch minimizes the channel
time, enabling faster I/O than dynamic prefetch if the control unit hit ration is
high.
v For some types of control units, list prefetch is faster than sequential I/O for
skip-sequential access. You can check with your storage vendor to learn whether
that is true for your type of control unit.
Sorts for GROUP BY and ORDER BY
These sorts are indicated by SORTC_ORDERBY, and SORTC_GROUPBY in
PLAN_TABLE.
Chapter 56. Interpreting data access by using EXPLAIN 729
|
If the statement includes both a GROUP BY clause and an ORDER BY clause, and
if every item in the ORDER-BY list is in the GROUP-BY list, then only one sort is
performed, which is marked as SORTC_ORDERBY.
The performance of the sort by the GROUP BY clause is improved when the query
accesses a single table and when the GROUP BY column has no index.
Sorts to remove duplicates
This type of sort is used to process a query with SELECT DISTINCT, with a set
function such as COUNT(DISTINCT COL1), or to remove duplicates in UNION
processing. It is indicated by SORTC_UNIQ in PLAN_TABLE.
Sorts used in join processing
For hybrid join (METHOD 4) and nested loop join (METHOD 1), the composite
table can be sorted to make the join more efficient. For merge join (METHOD 2),
both the composite table and new table need to be sorted unless an index is used
for accessing these tables that gives the correct order already. The sorts needed for
join processing are indicated by SORTN_JOIN and SORTC_JOIN in the
PLAN_TABLE.
Nested loop join
SORTN_JOIN is only valid in support of star join. When SORTN_JOIN = Y
for nested loop join, the qualified rows from the dimension or snowflake
are sorted into the fact table join column sequence. A sparse index might
also be created as a result of the sort to support efficient feedback loop
skipping processing which is part of the star-join execution.
Sort merge join
The new table is accessed and sorted into join column sequence.
Hybrid join
When SORTN_JOIN is on, the intermediate table is sorted into inner table
rid sequence to support efficient list prefetch access to the inner table data.
Sorts of RIDs
To perform list prefetch, DB2 sorts RIDs into ascending page number order. This
sort is very fast and is done totally in memory.
PSPI A RID sort is usually not indicated in the PLAN_TABLE, but a RID sort
normally is performed whenever list prefetch is used. The only exception to this
rule is when a hybrid join is performed and a single, highly clustered index is used
on the inner table. In this case SORTN_JOIN is 'N', indicating that the RID list for
the inner table was not sorted.