Chinaunix首页 | 论坛 | 博客
  • 博客访问: 2885542
  • 博文数量: 599
  • 博客积分: 16398
  • 博客等级: 上将
  • 技术积分: 6875
  • 用 户 组: 普通用户
  • 注册时间: 2009-11-30 12:04
个人简介

WINDOWS下的程序员出身,偶尔也写一些linux平台下小程序, 后转行数据库行业,专注于ORACLE和DB2的运维和优化。 同时也是ios移动开发者。欢迎志同道合的朋友一起研究技术。 数据库技术交流群:58308065,23618606

文章分类

全部博文(599)

文章存档

2014年(12)

2013年(56)

2012年(199)

2011年(105)

2010年(128)

2009年(99)

分类: Oracle

2012-02-10 14:27:28

Unbalanced indexes ?

 

There are several articles on the Internet about Oracle's implementation of indexes, and the need to rebuild indexes from time to time. Somewhere in these articles there is almost always a short description about how indexes can become unbalanced, and the effect this can have.  Unfortunately, this seems to ignore the fact that the B-tree mechanism used by Oracle is a 'Balanced B-tree' index- in other words an index that cannot become unbalanced.

 

 


 

What does balanced mean ?

Given that Oracle uses Balanced  B-trees for indexes, why is it that so many people believe that their indexes can become unbalanced ?

 

And what is Balanced B-tree anyway ?

 

Perhaps the answer to the second question can tell us the answer to the first.

 

Technically the important feature about Balanced B-trees is that the distance between any leaf block and the root block is the same at any one point in time - the balance is top-to-bottom.

 

In the case of Oracle this can be seen quite easily by performing a treedump, as indicated in Fig. 1

 

select  object_id

from    user_objects

where   object_type = 'INDEX'

and object_name = 'T1_IDX1'

-- and subobject_name = . . .

;

 

alter session set events '

'immediate trace name treedump level N';

 

Figure 1  Steps involved in a tree dump.

 

First you find the object_id of the index, or index partition, that you what to dump; then you invoke the treedump event using the object_id as the level. If you examine the trace file generated by this tree dump, you will find lines like those shown in Fig.2:

 

This trace file holds a recursive descent of the index, showing branch blocks (of which the root block is just a special case) and leaf blocks. Note that the first block dumped (the root block) actually records a level, and every branch block below it records a level as well, but leaf blocks do not.

branch: 0x14001aa 20971946 (.. level 2)

  branch: 0x14003ef 20972527 (.. level 1)

    leaf: 0x14001ab 20971947 (..)

    leaf: 0x14001ac 20971948 (..)

Figure 2  Extract from tree dump.

 

The level of the root block is the blevel of the index (as recorded in the view index_stats after a validate index call). The height of the index (as recorded in the view dba_indexes after an analyze index) is just the blevel + 1.

 

Every leaf block is exactly this many steps away from the root. The index is always balanced.

 

So why do many people believe that Oracle allows indexes to become unbalanced ?

 

Mea Culpa

At this point I have to admit guilt to repeating, only a year ago (May 2002), a well-known and completely wrong, description of leaf-block splitting. And I managed to do this despite knowing (and describing in my book (Dec 2000)) the details of the treedump.

 

I have an idea that the error germinated from statements appearing in an Oracle 5 manual, many years ago, to the effect that Oracle indexes were balanced because "no leaf block was more than one step further from the root block than any other leaf block". Combine this (true, but not quite true enough) statement with a simplistic image of block splits and voila - an almost unbreakable legend is born.

 

Figures 3 and 4 portray a very common, but completely incorrect, idea of how a leaf block splits.

 

 


 

 

 Figure 3  Index just before a leaf block split.

 

A commonly held idea is that the leaf block splits to share its data between two new blocks then, to point to these two new blocks, Oracle inserts a fresh branch block where the original leaf block used to be to hold the pointers. Thus, in this erroneous view, the index is deeper on the right than on the left.  (It is often said that indexes on sequence-based columns cause the biggest problems because the theory suggests that they cause the rightmost leaf block to keep on descending further and further from the root each time it splits).


 

  

Figure 4  This is NOT how a leaf block split occurs.

 

 

In fact, the work done by Oracle is much more subtle, forward-looking and efficient. The result of a complex leaf block split is shown in Figure 5.

 

As the leaf block splits into two, Oracle tries to insert one extra pointer into the branch block that is currently pointing to that leaf. 

 

But if the branch block is full, Oracle splits the branch block in two, shares the existing leaf pointers between these two branch blocks, and (recursively) inserts a new branch pointer into the branch block one level higher up to point to the new branch block.

 

But if Oracle reaches the root block during this process, and the root block is also full, then the root block has to split as well. In which case a new root block will be created to hold two branch pointers.  (In fact, Oracle handles the root block split slightly differently from ordinary branch block splits to ensure that the root block can always be found in the same place in the physical segment, no matter how many times a root block split has occurred.

 


 

  

Figure 5   How the index looks after a recursive split.

 

Note how this recursive climb up the tree means that at all times, the index remains balanced.

 

 

Why is the myth so strong ?

Is there any reason why the legend of the unbalanced index should persist so strongly? I think the answer is yes.

 

Remember that the definition of the word 'balanced' has a very strict meaning when we are discussing B-trees. There is, however, a completely different interpretation that could be used for the word.

 

How, for example, would you describe the index in fig. 6, where the root block points to six leaf blocks but one leaf block is empty, three are nearly empty, and two are well packed.  (Note - the extra lines from the root to the leaf blocks are there to highlight the uneven distribution of the index packing, in reality there is only one pointer in the root for each leaf block).


 

 

 

 

 

 

 

 

 

Figure 6  A non-technical meaning of 'unbalanced'

 

A 'human' response to seeing this pattern in an index would, indeed, be to call it 'unbalanced'. Clearly the right hand side of the index is 'heavier' than the left.  It is unfortunate that the informal human expression should be so apt, when the technical expression means something completely different.

 

Perhaps it is this conflict between technical and informal usage that has led to the confusion.

 

Indexes on sequence-generated columns can very easily become 'unbalanced' in the informal sense - especially when they are being used to represent or process FIFO queueing mechanisms. However, even when they are (informally) unbalanced, they are still (technically) Balanced B-trees.

(Perhaps it would be a good idea to push for terms such as "skewed" or "unevenly distributed" as safer descriptions of such indexes.)

 

Sometimes it takes just a couple of articles or presentations where terms are used carelessly to establish a legend - in this case one that has resulted in numerous DBAs wasting countless hours rebuilding indexes unnecessarily.

 

Remember, the next time you decide that Oracle is doing something daft and inefficient, perhaps the problem lies in an old misunderstanding and not in the software itself.

 

Warning.

If you want to investigate further, there are a couple of problems with the treedump command. For some versions of Oracle it dumps one line for each block in the index segment - which can be quite expensive and slow since it requires an ordered visit to every block in the index.  However in Oracle 9.0 the trace file seems to do a full block dump of every block - which will be enormous and extremely slow.

 

The second problem is version-generic. If the index has been created as a consequence of a primary key or unique key constraint being defined, then the flags column in the ind$ table has bit 13 set - and this causes the treedump to crash with the error 'invalid value'. This doesn't cause problems for the partitions of a partitioned index, but for all other primary key and unique key indexes (including non-partitioned IOTs) it is a nuisance.  It is generally a good idea to create the index, and then create the constraint - this happens to bypass the problem in all cases (except for the IOT, of course). In a crunch, you could update the ind$ table directly to clear this bit - but obviously not without approval from Oracle support.

 

Conclusion

When talking about Balanced B-tree indexes, the term "balanced" means top to bottom, not left to right.

 

Oracle really does implement a version of 'Balanced B-tree indexes', so at any moment all leaf blocks in an index are exactly the same distance from the root - a distance that can be found in the blevel column of the view index_stats immediately after executing a validate index, or as the height (which equals blevel + 1) in the view user_indexes if the index has been recently analyzed.

 

Resist the argument that you need to rebuild indexes, especially sequence-based indexes, regularly because "they become unbalanced". It isn't a valid argument.

 

Acknowledgements.

This article was first published on


 


 

Jonathan Lewis is a freelance consultant with more than 18 years experience of Oracle.  He specialises in physical database design and the strategic use of the Oracle database engine, is author of 'Practical Oracle 8I - Designing Efficient Databases' published by Addison-Wesley, and is one of the best-known speakers on the UK Oracle circuit. Further details of his published papers, presentations and seminars can be found at which also hosts The Co-operative Oracle Users' FAQ for the Oracle-related Usenet newsgroups.

 

 

阅读(1332) | 评论(0) | 转发(0) |
给主人留下些什么吧!~~