11-02-2013, 03:03 PM
Integrity in Distributed Database
Abstract:
Distributed storage systems must provide highly available access to data while maintaining high performance and maximum scalability. In addition, reliability in a storage system is of the utmost importance, and the correctness and availability of data must be guaranteed. We have designed the Sigma cluster file system to address these goals by distributing data across multiple nodes and keeping parity across these nodes. With data spread across multiple nodes, however, ensuring the consistency of the data requires special techniques. In this paper, we describe fault-tolerant algorithms to maintain the consistency and reliability of the file system, both data and metadata. We show how these techniques guarantee data integrity and availability even under failure-mode scenarios.
Introduction
The traditional storage solution has typically been direct attached storage (DAS), where the actual disk hardware is directly connected to the application server through high-speed channels such as SCSI or IDE. With the proliferation of local area networks, the use of network file servers has increased, leading to the development of several distributed file systems that make the local server's DAS file system visible to other machines on the network. These include AFS/Coda [1], NFS [2], Sprite [3], and CIFS [4], amongst others. However, with a single file server serving a large number of clients, scalability is limited. In an effort to remove the bottleneck of the single-server model, there has lately been significant work in the area of clustered or distributed storage systems, including systems that distribute data amongst dedicated storage servers.
Sigma Cluster File System
The Sigma Cluster File System (SCFS) is based on a clustered storage architecture. The physical layout is shown in Figure 1. Clients can connect to the Sigma cluster using a distributed file system protocol such as NFS or CIFS. However, unlike a traditional single-server model, a client can connect to any of the nodes in the cluster and still see the same unified file system. The multiple nodes in the cluster allow the Sigma system to eliminate the single-server bottleneck of a traditional file server. In this model, applications do not run on the cluster nodes; they run on client nodes and communicate with the cluster storage through the NFS or CIFS protocols.
NFS and CIFS data requests are translated through a POSIX-like API called the clientlib. The clientlib takes these requests and conveys them to the SCFS, which is distributed across the nodes of the cluster. The SCFS is responsible for file management as well as data distribution, i.e., striping data across nodes using varying redundancy policies.
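As a rough illustration of one such redundancy policy, the sketch below stripes a byte stream across nodes with XOR parity. The chunk size, stripe width, and function names are illustrative assumptions, not the actual SCFS interfaces:

```python
from functools import reduce

CHUNK = 4   # bytes per chunk (tiny, for illustration only)
NODES = 4   # data chunks per stripe, i.e. data nodes in one stripe

def xor_chunks(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def stripe_with_parity(data: bytes):
    """Split data into CHUNK-sized pieces, group them NODES at a time,
    and compute one XOR parity chunk per stripe."""
    chunks = [data[i:i + CHUNK].ljust(CHUNK, b"\0")
              for i in range(0, len(data), CHUNK)]
    stripes = []
    for i in range(0, len(chunks), NODES):
        group = chunks[i:i + NODES]
        group += [b"\0" * CHUNK] * (NODES - len(group))   # pad last stripe
        stripes.append((group, reduce(xor_chunks, group)))
    return stripes

def recover_chunk(group, parity, lost: int) -> bytes:
    """Rebuild the chunk at index `lost` from the survivors plus parity."""
    survivors = [c for j, c in enumerate(group) if j != lost]
    return reduce(xor_chunks, survivors + [parity])
```

XOR parity of this kind tolerates the loss of any single chunk per stripe, which is consistent with the single-fault tolerance the paper claims for the system as a whole.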
Virtual Device Controller
We have introduced a concurrency control mechanism similar to the device-served locking method described in [11]. To ensure safe access to files, we introduce a file-based locking mechanism. Each file is mapped to a per-file abstract object called a virtual device (VD), which is in turn managed by exactly one of a set of virtual device controllers (VDCs) distributed throughout the cluster. A VDC is one of a set of processes instantiated on each node; at any point in time, a VDC may host zero, one, or more virtual devices.
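One simple way to picture the file-to-virtual-device mapping is to hash the file path into a fixed VD space, with each VDC hosting some subset of the VDs. This is a minimal sketch under that assumption; `vd_for_file` and the `VDC` class are hypothetical names, and the paper does not specify the actual mapping function:

```python
import hashlib

NUM_VDS = 64   # assumed size of the virtual-device id space

def vd_for_file(path: str) -> int:
    """Map a file path to a virtual-device id deterministically,
    so every node computes the same owner VD for a given file."""
    digest = hashlib.sha1(path.encode("utf-8")).hexdigest()
    return int(digest, 16) % NUM_VDS

class VDC:
    """One VDC process per node; hosts zero, one, or more virtual devices."""
    def __init__(self, node_id: str):
        self.node_id = node_id
        self.hosted = set()          # VD ids currently hosted by this VDC

    def host(self, vd_id: int):
        self.hosted.add(vd_id)

    def release(self, vd_id: int):
        self.hosted.discard(vd_id)
```

Because the mapping is a pure function of the path, any node can locate the VD responsible for a file without consulting a central directory.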
Global Controller
We now describe the Global Controller (GC) in more detail, and how it interacts with the VDC processes running on every machine. This protocol had several principal design goals: two VDCs must never host the same VD at the same time; it must handle machine, network, and possibly software failures without generating inconsistent states; and it must be efficient.
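The single-host invariant can be pictured as an ownership table maintained by the GC. The sketch below uses per-VD epoch numbers as a stand-in fencing mechanism so that a revoked owner's stale requests can be rejected; the epoch scheme is an assumption for illustration, not the paper's actual protocol:

```python
class GlobalController:
    """Tracks which VDC hosts each virtual device and enforces the
    invariant that a VD has at most one host at any time."""

    def __init__(self):
        self.owner = {}   # vd_id -> vdc_id of the current host
        self.epoch = {}   # vd_id -> monotonically increasing assignment epoch

    def assign(self, vd_id: int, vdc_id: str) -> int:
        if vd_id in self.owner:
            raise RuntimeError(
                f"VD {vd_id} is already hosted by {self.owner[vd_id]}")
        self.epoch[vd_id] = self.epoch.get(vd_id, 0) + 1
        self.owner[vd_id] = vdc_id
        return self.epoch[vd_id]   # the VDC tags its requests with this epoch

    def revoke(self, vd_id: int):
        # On suspected VDC failure, revoke before reassigning; nodes can
        # then reject requests carrying an older epoch from the old owner.
        self.owner.pop(vd_id, None)
```

The key point the table captures is ordering: revocation must complete before reassignment, so that no window exists in which two VDCs both believe they host the same VD.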
The GC must be instantiated by the cluster on one and only one machine. This instantiation process is an instance of the well-known leader-election problem. There has been significant work in this area, and our algorithm is similar to previous work [15]. We elect the GC using an asynchronous-iterative election protocol.
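The paper does not spell out the election protocol at this point, but one minimal iterative flavor, shown here purely for illustration, has each node repeatedly adopt the smallest node id it has heard from its neighbors until no value changes; all nodes in a connected component then agree on the same GC:

```python
def elect_gc(neighbors: dict) -> dict:
    """Simulate a round-based minimum-id election.
    `neighbors` maps each node id to the set of node ids it can reach.
    Returns a map from node id to the leader that node settles on."""
    best = {n: n for n in neighbors}      # every node starts proposing itself
    changed = True
    while changed:                        # iterate until a fixed point
        changed = False
        for node, peers in neighbors.items():
            seen = [best[node]] + [best[p] for p in peers if p in best]
            candidate = min(seen)
            if candidate < best[node]:
                best[node] = candidate
                changed = True
    return best
```

In a real cluster the exchanges happen over the network and rounds are not synchronized, which is what makes failure handling (and hence the uniqueness of the elected GC) the hard part of the actual protocol.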
Conclusions
In this paper, we have presented a series of algorithms for a fault-tolerant distributed storage system. The algorithms preserve data and file system integrity and consistency in the presence of concurrent reads and writes. We have shown how these protocols are designed to tolerate single faults, both transient and permanent. The protocols are designed with fault tolerance as a principal design goal.