23-01-2013, 10:18 AM
B-Tree File System
1B-Tree.pdf (Size: 426.03 KB / Downloads: 97)
Abstract
B-Trees are used in many file-systems to represent files or directories. B-Trees
ensures the logarithmic time key-search, insert and remove operations. Shadowing is
another powerful mechanism to ensure the atomic updates to pages on the disk. But
the combination of these two techniques proves to be expensive if conventional B-Tree
structure is used. There are some serious performance issues if conventional B-Tree is used
along with shadowing because everything is being performed on disk, which is a slower
device. B-Tree file-system uses these two techniques, but B-Tree is slightly modified to
be compatible with shadowing technique. We will discuss some of these modifications.
This report covers the introduction to B-Trees and shadowing, and design, some data
structures, features, specification of B-Tree file-system.
Introduction to B-Trees and Shadowing
This chapter covers basic structure of B-Tree with an example, and concept of
Shadowing followed by the uses of the same.
B-Trees
B-Tree is the generalization of the binary search tree. B+-Tree can be considered
as B-Tree variant, with an exception that in B+-Tree only leafs contain the data. In
binary search trees, we have nodes having single search-key and left sub-tree and right
sub-tree containing all nodes with search-keys that are less than and greater than parent
search-key respectively. In B+-Tree, we can have multiple search-keys, and multiple child
nodes. Figure 1.1 shows the example of binary search tree.
Shadowing
Shadowing scheme is also known as copy-on-write (COW) scheme. Shadowing technique
is used to ensure atomic update to persistent data-structures in file-system. In
this scheme, we look at the storage in terms of fixed-size pages. There is a page table
which has a pointer to all valid pages. Shadowing means that to update an on-disk page,
the entire page is read into memory, modified, and later written to disk at an alternate
location. Now all we have to do is to update the pointer in the page table to point to
this new page in the disk. Byte-size of pointer is small and it can fit is one sector in
the disk. There are hard drives that offer atomic sector upgrades and promise you that
either all of the old or new data in the sector. This means you either have an old page or
newly written page. So atomic persistent updates are ensured due
B-Trees + Shadowing
This chapter covers the problems that arise when we try to combine the b-trees
and the shadowing. Ohad Rodeh, IBM Researcher, have addressed these problems, and
also suggested modifications to conventional b-tree. We will cover solution to few of the
problems in the text below.
Problems with conventional B-Trees
The entire file-system tree on the disk can be looked as made of fixed-size pages.
When a page is to be modified, it is read into memory, modified, and later written to
some other location in the disk. Now let us assume that the leaf in b-tree shown below is
equivalent to one one-disk page. If we try to modify the leaf, then page corresponding to
the leaf will be shadowed. Now, the next immediate ancestor of this node should point to
this node. This means we will have to modify the ancestor of this node. Again shadowing
is involved, and this process continues up to the root recursively. So entire path up to the
root need to be shadowed. We will call this type of shadowing as strict shadowing.
Now the one additional problem arises due to linking of the leafs in tree. Since
adjacent leaf should also point to the modified leaf, it is also needed to be shadowed.
This process leads to shadowing of the entire tree just because of modification in one
leaf. Remember, this all is going to happen in the hard-disks! This lead to performance
degradation. The root of the problem is leaf chaining.
To solve the issues related to concurrency, we use mutex locks or semaphores.
Now, let’s assume for while that there are no links in leafs. In normal b-tree, suppose we
need to modify a single node, we take a lock on it, make changes and then release the
lock. But if method of updation is shadowing, then we know that changes propagate to
the root, making it necessary to take locks on the way up to the tree root. So there is
race to take the lock on the tree root. Waiting for lock is time consuming process, and
hence there is need of efficient synchronization.
Modifications in conventional B-tree
Ohad Rodeh, IBM Researcher, have suggested modifications to conventional b-tree
and algorithms related to it, for integrating b-tree schemes with shadowing technique. We
will cover few of them, related to problems discussed above.
1. To solve the problem related to shadowing of whole tree, the links between the leafs
are removed. Due to this, only the path up to the tree root needs to be shadowed.
2. In case of rebalancing operation, it is better to exchange the keys between nodes
whose immediate ancestor is same, because this will involve shadowing of single
node, which is better instead of shadowing the another path up to the root involving
many nodes.