11-07-2012, 11:27 AM
B-trees
btrees.pdf (Size: 183.66 KB / Downloads: 206)
What are B-trees?
• B-trees are balanced search trees: height = O
log(n)
for the worst case.
• They were designed to work well on Direct Access secondary storage devices
(magnetic disks).
• Similar to red-black trees, but show better performance on disk I/O operations.
• B-trees (and variants like B+ and B* trees ) are widely used in database systems.
Motivation
Data structures on secondary storage:
• Memory capacity in a computer system consists broadly on 2 parts:
1. Primary memory: uses memory chips.
2. Secondary storage: based on magnetic disks.
• Magnetic disks are cheaper and have higher capacity.
• But they are much slower because they have moving parts.
B-trees try to read as much information as possible in every disk access operation.
B-tree: definition
A B-tree T is a rooted tree (with root root[T]) with properties:
• Every node x has four fields:
1. The number of keys currently stored in node x, n[x].
2. The n[x] keys themselves, stored in nondecreasing order:
key1[x] key2[x] · · · keyn[x][x] .
3. A boolean value,
leaf[x] =
(
True if x is a leaf ,
False if x is an internal node .
4. n[x] + 1 pointers, c1[x], c2[x], . . . , cn[x]+1[x] to its children.
(As leaf nodes have no children their ci are undefined).
Splitting a node in a B-tree (I)
• Inserting a key into a B-tree is more complicated than in binary search tree.
• Splitting of a full node y (2t − 1 keys) fundamental operation during insertion.
• Splitting around median key keyt[y] into 2 nodes.
• Median key moves up into y’s parent (which has to be nonfull).
• If y is root node tree height grows by 1.
Deleting a Key — Cases I
• Considering 3 distinct cases for deletion
• Let k be the key to be deleted, x the node containing the key. Then the cases are:
1. If key k is in node x and x is a leaf, simply delete k from x
2. If key k is in node x and x is an internal node, there are three cases to consider:
(a) If the child y that precedes k in node x has at least t keys (more than the
minimum), then find the predecessor key k0 in the subtree rooted at y. Recursively
delete k0 and replace k with k0 in x
(b) Symmetrically, if the child z that follows k in node x has at least t keys, find the
successor k0 and delete and replace as before. Note that finding k0 and deleting
it can be performed in a single downward pass
© Otherwise, if both y and z have only t−1 (minimum number) keys, merge k and
all of z into y, so that both k and the pointer to z are removed from x. y now
contains 2t − 1 keys, and subsequently k is deleted