11-02-2013, 03:03 PM
Integrity in Distributed Database
Abstract:
Distributed storage systems must provide highly available access to data while maintaining high performance and maximum scalability. In addition, reliability in a storage system is of the utmost importance, and the correctness and availability of data must be guaranteed. We have designed the Sigma cluster file system to address these goals by distributing data across multiple nodes and keeping parity across these nodes. With data spread across multiple nodes, however, ensuring the consistency of the data requires special techniques. In this paper, we describe fault-tolerant algorithms to maintain the consistency and reliability of the file system, both data and metadata. We show how these techniques guarantee data integrity and availability even under failure-mode scenarios.
Introduction
The traditional storage solution has typically been direct attached storage (DAS), where the actual disk hardware is directly connected to the application server through high-speed channels such as SCSI or IDE. With the proliferation of local area networks, the use of network file servers has increased, leading to the development of several distributed file systems that make the local server's DAS file system visible to other machines on the network. These include AFS/Coda [1], NFS [2], Sprite [3], and CIFS [4], amongst others. However, with a single file server serving a large number of clients, scalability is limited. In an effort to remove the bottleneck of the single-server model, there has lately been significant work in the area of clustered or distributed storage systems, including systems that distribute data amongst dedicated storage servers.
Sigma Cluster File System
The Sigma Cluster File System (SCFS) is based on a clustered storage architecture. The physical layout is shown in Figure 1. Clients can connect to the Sigma cluster using a distributed file system protocol such as NFS or CIFS. However, unlike a traditional single-server model, a client can connect to any of the nodes in the cluster and still see the same unified file system. The multiple nodes in the cluster allow the Sigma system to eliminate the single-server bottleneck of a traditional file server. In this model, applications do not run on the cluster nodes; they run on client nodes and communicate with the cluster storage through the NFS or CIFS protocols.
NFS and CIFS data requests are translated through a POSIX-like API called the clientlib. The clientlib takes these requests and conveys them to the SCFS, which is distributed across the nodes of the cluster. The SCFS is responsible for file management as well as data distribution, i.e., striping data across nodes using varying redundancy policies.
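As a rough illustration of one such redundancy policy, the sketch below stripes a byte stream across nodes with XOR parity. The chunk size, stripe width, and function names are illustrative assumptions, not the actual SCFS interfaces:

```python
from functools import reduce

CHUNK = 4   # bytes per chunk (tiny, for illustration only)
NODES = 4   # data chunks per stripe, i.e. data nodes in one stripe

def xor_chunks(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def stripe_with_parity(data: bytes):
    """Split data into CHUNK-sized pieces, group them NODES at a time,
    and compute one XOR parity chunk per stripe."""
    chunks = [data[i:i + CHUNK].ljust(CHUNK, b"\0")
              for i in range(0, len(data), CHUNK)]
    stripes = []
    for i in range(0, len(chunks), NODES):
        group = chunks[i:i + NODES]
        group += [b"\0" * CHUNK] * (NODES - len(group))   # pad last stripe
        stripes.append((group, reduce(xor_chunks, group)))
    return stripes

def recover_chunk(group, parity, lost: int) -> bytes:
    """Rebuild the chunk at index `lost` from the survivors plus parity."""
    survivors = [c for j, c in enumerate(group) if j != lost]
    return reduce(xor_chunks, survivors + [parity])
```

XOR parity of this kind tolerates the loss of any single chunk per stripe, which is consistent with the single-fault tolerance the paper claims for the system as a whole.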
Virtual Device Controller
We have introduced a concurrency control mechanism similar to the device-served locking method described in [11]. To ensure safe access to files, we introduce a file-based locking mechanism. Each file is mapped to a per-file abstract object called a virtual device (VD), which is in turn managed by exactly one of a set of virtual device controllers (VDCs) distributed throughout the cluster. A VDC is one of a set of processes instantiated on each node; at any point in time, a VDC may host zero, one, or more virtual devices.
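One simple way to picture the file-to-virtual-device mapping is to hash the file path into a fixed VD space, with each VDC hosting some subset of the VDs. This is a minimal sketch under that assumption; `vd_for_file` and the `VDC` class are hypothetical names, and the paper does not specify the actual mapping function:

```python
import hashlib

NUM_VDS = 64   # assumed size of the virtual-device id space

def vd_for_file(path: str) -> int:
    """Map a file path to a virtual-device id deterministically,
    so every node computes the same owner VD for a given file."""
    digest = hashlib.sha1(path.encode("utf-8")).hexdigest()
    return int(digest, 16) % NUM_VDS

class VDC:
    """One VDC process per node; hosts zero, one, or more virtual devices."""
    def __init__(self, node_id: str):
        self.node_id = node_id
        self.hosted = set()          # VD ids currently hosted by this VDC

    def host(self, vd_id: int):
        self.hosted.add(vd_id)

    def release(self, vd_id: int):
        self.hosted.discard(vd_id)
```

Because the mapping is a pure function of the path, any node can locate the VD responsible for a file without consulting a central directory.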
Global Controller
We now describe the Global Controller (GC) in more detail, and how it interacts with the VDC processes running on every machine. This protocol had several principal design goals: two VDCs must never host the same VD at the same time; it must handle machine, network, and possibly software failures without generating inconsistent states; and it must be efficient.
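The single-host invariant can be pictured as an ownership table maintained by the GC. The sketch below uses per-VD epoch numbers as a stand-in fencing mechanism so that a revoked owner's stale requests can be rejected; the epoch scheme is an assumption for illustration, not the paper's actual protocol:

```python
class GlobalController:
    """Tracks which VDC hosts each virtual device and enforces the
    invariant that a VD has at most one host at any time."""

    def __init__(self):
        self.owner = {}   # vd_id -> vdc_id of the current host
        self.epoch = {}   # vd_id -> monotonically increasing assignment epoch

    def assign(self, vd_id: int, vdc_id: str) -> int:
        if vd_id in self.owner:
            raise RuntimeError(
                f"VD {vd_id} is already hosted by {self.owner[vd_id]}")
        self.epoch[vd_id] = self.epoch.get(vd_id, 0) + 1
        self.owner[vd_id] = vdc_id
        return self.epoch[vd_id]   # the VDC tags its requests with this epoch

    def revoke(self, vd_id: int):
        # On suspected VDC failure, revoke before reassigning; nodes can
        # then reject requests carrying an older epoch from the old owner.
        self.owner.pop(vd_id, None)
```

The key point the table captures is ordering: revocation must complete before reassignment, so that no window exists in which two VDCs both believe they host the same VD.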
The GC must be instantiated by the cluster on one and only one machine. This instantiation process is an instance of the well-known leader-election problem. There has been significant work in this area, and our algorithm is similar to previous work [15]. We elect the GC using an asynchronous-iterative election protocol.
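The paper does not spell out the election protocol at this point, but one minimal iterative flavor, shown here purely for illustration, has each node repeatedly adopt the smallest node id it has heard from its neighbors until no value changes; all nodes in a connected component then agree on the same GC:

```python
def elect_gc(neighbors: dict) -> dict:
    """Simulate a round-based minimum-id election.
    `neighbors` maps each node id to the set of node ids it can reach.
    Returns a map from node id to the leader that node settles on."""
    best = {n: n for n in neighbors}      # every node starts proposing itself
    changed = True
    while changed:                        # iterate until a fixed point
        changed = False
        for node, peers in neighbors.items():
            seen = [best[node]] + [best[p] for p in peers if p in best]
            candidate = min(seen)
            if candidate < best[node]:
                best[node] = candidate
                changed = True
    return best
```

In a real cluster the exchanges happen over the network and rounds are not synchronized, which is what makes failure handling (and hence the uniqueness of the elected GC) the hard part of the actual protocol.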
Conclusions
In this paper, we have presented a series of algorithms for a fault-tolerant distributed storage system. The algorithms preserve data and file system integrity and consistency in the presence of concurrent reads and writes. We have shown how these protocols are designed to tolerate single faults, both transient and permanent. The protocols are designed with fault tolerance as a principal design goal.