Chinaunix首页 | 论坛 | 博客
  • 博客访问: 112294
  • 博文数量: 36
  • 博客积分: 2260
  • 博客等级: 大尉
  • 技术积分: 400
  • 用 户 组: 普通用户
  • 注册时间: 2008-12-27 22:49
文章分类

全部博文(36)

文章存档

2011年(10)

2010年(26)

我的朋友

分类: Oracle

2011-02-06 15:08:11

B-tree是大量BLOCK读写的最好选择,常用于文件系统和数据库
 
B-tree的定义
1)每个node最多有M个子节点
2)每个node最少有M/2个子节点,root node除外
3)root node最少有2个node(不是leaf node)
4)每个non-leaf node有K个子节点就包含K-1的keys
5)所有的leaf node在同一level
 
对于无索引的表的平均search比较次数
假设表的记录有1000000行,利用binary search的search时间是
log21,000,000 = 19.931....
需要20次的比较
 
对于有index的表的平均search比较次数为14次,少了6次比较时间
 
对于频繁的 insert和delete (update=delete+insert),B-tree需要处理一些额外的问题
 
B-tree的最优高度和最差高度

The best case height of a B-Tree is:

.log_{m} n..

The worst case height of a B-Tree is:

.log_{m/2}n.
n为表的行数,m为一个node拥有的最多子节点
 
insertion的处理

All insertions start at a leaf node. To insert a new element

Search the tree to find the leaf node where the new element should be added. Insert the new element into that node with the following steps:

  1. If the node contains fewer than the maximum legal number of elements, then there is room for the new element. Insert the new element in the node, keeping the node's elements ordered.
  2. Otherwise the node is full, so evenly split it into two nodes.
    1. A single median is chosen from among the leaf's elements and the new element.
    2. Values less than the median are put in the new left node and values greater than the median are put in the new right node, with the median acting as a separation value.
    3. Insert the separation value in the node's parent, which may cause it to be split, and so on. If the node has no parent (i.e., the node was the root), create a new root above this node (increasing the height of the tree).

deletion的处理

There are two popular strategies for deletion from a B-Tree.

  • locate and delete the item, then restructure the tree to regain its invariants

or

  • do a single pass down the tree, but before entering (visiting) a node, restructure the tree so that once the key to be deleted is encountered, it can be deleted without triggering the need for any further restructuring

The algorithm below uses the former strategy.

There are two special cases to consider when deleting an element:

  1. the element in an internal node may be a separator for its child nodes
  2. deleting an element may put its node under the minimum number of elements and children.

Each of these cases will be dealt with in order.

 Deletion from a leaf node
  • Search for the value to delete.
  • If the value is in a leaf node, it can simply be deleted from the node,
  • If underflow happens, check siblings to either transfer a key or fuse the siblings together.
  • if deletion happened from right child retrieve the max value of left child if there is no underflow in left child
  • in vice-versa situation retrieve the min element from right
 Deletion from an internal node

Each element in an internal node acts as a separation value for two subtrees, and when such an element is deleted, two cases arise. In the first case, both of the two child nodes to the left and right of the deleted element have the minimum number of elements, namely L−1. They can then be joined into a single node with 2L−2 elements, a number which does not exceed U−1 and so is a legal node. Unless it is known that this particular B-tree does not contain duplicate data, we must then also (recursively) delete the element in question from the new node.

In the second case, one of the two child nodes contains more than the minimum number of elements. Then a new separator for those subtrees must be found. Note that the largest element in the left subtree is still less than the separator. Likewise, the smallest element in the right subtree is the smallest element which is still greater than the separator. Both of those elements are in leaf nodes, and either can be the new separator for the two subtrees.

  • If the value is in an internal node, choose a new separator (either the largest element in the left subtree or the smallest element in the right subtree), remove it from the leaf node it is in, and replace the element to be deleted with the new separator.
  • This has deleted an element from a leaf node, and so is now equivalent to the previous case.

rebalancing after deletion

参考

 

阅读(677) | 评论(0) | 转发(0) |
给主人留下些什么吧!~~