Presented By:
V.Kartheeka
A.Kalpana
Abstract
Large applications running on Grid or cluster architectures with hundreds or thousands of computational nodes raise reliability problems. These problems stem from node failures and from the need for dynamic configuration over long runtimes. The fault-tolerance approaches presented here allow recovery even on a different number of processors, which makes them especially suitable for applications that need adaptive or reactionary configuration control.
The low-cost protocols offer the capability of controlling or bounding the overhead. A formal cost model is presented, followed by an experimental evaluation. It is shown that the overhead of the protocol is very small, and the maximum work lost by a crashed process is small and bounded.
Introduction
What is Grid
A computational grid is a hardware and software infrastructure that provides dependable, consistent, pervasive, and inexpensive access to high-end computational capabilities.
What is Grid Computing
Grid computing is a parallel processing architecture in which CPU resources are shared across a network and all machines function as one large supercomputer.
Introduction cont’d
Need for Grid Computing:
In the commercial world, grid computing aims to maximize the utilization of an organization's computing resources by making them shareable across applications and, potentially, to provide computing on demand to third parties as a utility service.
Fault-tolerance
In computer science, fault tolerance is the property of a computer system to continue operation at an acceptable quality despite the unexpected occurrence of hardware or software failures.
Roll-back recovery
Roll-back recovery reverts the system state to some earlier, correct version, for example by using checkpointing, and moves forward from there. It requires that the operations between the checkpoint and the detected erroneous state can be made idempotent.
Algorithm Description
Logging can be classified as pessimistic, optimistic, or causal. Log-based recovery relies on the fact that the execution of a process can be modeled as a sequence of state intervals. The execution during a state interval is deterministic.
However, each state interval is initiated by a nondeterministic event. Now assume that the system can capture and log sufficient information about the nondeterministic events that initiated each state interval. This is called the piecewise deterministic (PWD) assumption.
The crashed process can be recovered by
1) restoring it to the initial state and
2) replaying the logged events to it in the same order they appeared in the execution before the crash.
To avoid a rollback to the initial state of a process and to limit the number of nondeterministic events that need to be replayed, each process periodically saves its local state. Recovery then starts from the most recent saved state instead of the initial state.
Log-based mechanisms in which the only nondeterministic events in the system are message receptions are usually referred to as message logging.
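The recovery steps above can be illustrated with a minimal Java sketch. It is not the protocol proposed in this presentation. It only shows the general idea of message logging under the PWD assumption. The class name LoggedProcess, its methods, and the toy state update are hypothetical, and an in-memory list stands in for stable storage.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Toy process for illustrating message logging (hypothetical names).
 * The only nondeterministic events are message receptions; applying a
 * message to the state is deterministic and order-sensitive.
 */
class LoggedProcess {
    private long state = 0;            // current (volatile) local state
    private long checkpointState = 0;  // last periodically saved local state
    private int checkpointIndex = 0;   // number of logged messages covered by the checkpoint
    private final List<Long> log = new ArrayList<>(); // assumption: stands in for stable storage
    private final int checkpointPeriod;

    LoggedProcess(int checkpointPeriod) {
        if (checkpointPeriod <= 0) {
            throw new IllegalArgumentException("checkpointPeriod must be positive");
        }
        this.checkpointPeriod = checkpointPeriod;
    }

    /** Logs the received message, applies it, and checkpoints periodically. */
    void receive(long message) {
        log.add(message);
        apply(message);
        if (log.size() % checkpointPeriod == 0) {
            checkpointState = state;
            checkpointIndex = log.size();
        }
    }

    /** Deterministic state transition; the result depends on message order. */
    private void apply(long message) {
        state = state * 31 + message;
    }

    /** Simulates a crash: the volatile state is lost, the log and checkpoint survive. */
    void crash() {
        state = Long.MIN_VALUE;
    }

    /**
     * Restores the last checkpoint and replays the messages logged after it,
     * in their original order. Returns the number of replayed messages.
     */
    int recover() {
        state = checkpointState;
        int replayed = 0;
        for (int i = checkpointIndex; i < log.size(); i++) {
            apply(log.get(i));
            replayed++;
        }
        return replayed;
    }

    long getState() {
        return state;
    }
}
```

With a checkpoint period of 3 and five received messages, recovery restores the checkpoint taken after the third message and replays only the last two, so the replay work after a crash is bounded by the checkpoint period.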
Existing System
Communication-induced checkpointing protocols usually assume that any process can be checkpointed at any time.
An alternative approach relaxes the constraint that processes must always be checkpointable, without delaying any message reception or altering the message ordering enforced by the communication layer or by the application.
This protocol has been implemented within ProActive, an open-source Java middleware for asynchronous and distributed objects that implements the ASP (Asynchronous Sequential Processes) model.
Proposed System
This paper presents two fault-tolerance mechanisms called Theft-Induced Checkpointing and Systematic Event Logging. These are transparent protocols capable of overcoming problems associated with both benign faults, i.e., crash faults, and node or subnet volatility.
Specifically, the protocols base the state of the execution on a dataflow graph, allowing for efficient recovery in dynamic heterogeneous systems as well as multithreaded applications.
Model Diagram
Requirements
Hardware Requirements
SYSTEM : Pentium IV 2.4 GHz
HARD DISK : 40 GB
RAM : 256 MB
Software Requirements
OPERATING SYSTEM : Windows XP Professional
TECHNOLOGIES USED : J2SE, JDBC
DATABASE SERVER : MySQL 5.1