30-01-2016, 12:30 PM
ABSTRACT
Large applications executing on Grid or cluster architectures consisting of hundreds or thousands of computational nodes create reliability problems. The sources of the problems are node failures and the need for dynamic configuration over extensive runtime. This paper presents two mechanisms called Theft-Induced Checkpointing and Systematic Event Logging. These are transparent protocols capable of overcoming problems associated with both crash faults and node or subnet volatility. Specifically, the protocols base the state of the execution on a dataflow graph, allowing for efficient recovery in dynamic heterogeneous systems as well as multithreaded applications. By allowing recovery even under different numbers of processors, the approaches are especially suitable for applications that need dynamic configuration at runtime.
INTRODUCTION
Grid and cluster architectures have gained popularity for computationally intensive parallel applications. However, the complexity of the infrastructure, consisting of computational nodes, mass storage, and interconnection networks, poses great challenges with respect to overall system reliability. Simple tools of reliability analysis show that as the complexity of the system increases, its reliability, and thus its Mean Time to Failure (MTTF), decreases. If one models the system as a series reliability block diagram [30], the reliability of the entire system is computed as the product of the reliabilities of all system components. For applications executing on large clusters or a Grid, e.g., Grid5000 [13], the long execution times may exceed the MTTF of the infrastructure and thus render the execution infeasible. As an example, let us consider an execution lasting 10 days in a system that does not consider fault tolerance. Under the optimistic assumption that the MTTF of a single node is 2,000 days, the probability of failure of this long execution using 100, 200, or 500 nodes is 0.39, 0.63, or 0.91, respectively, rapidly approaching certain failure. The high failure probabilities are due to the fact that, in the absence of fault-tolerance mechanisms, the failure of a single node will cause the entire execution to fail. Note that this simple example does not even consider network failures, which are typically more likely than computer failures. Fault tolerance is thus a necessity to avoid failure in large applications, such as those found in scientific computing, executing on a Grid or large cluster.
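The failure probabilities in this example can be checked with a short calculation. The sketch below assumes that nodes fail independently and that node lifetimes are exponentially distributed, which the text does not state explicitly; the class and method names are illustrative only. Under that assumption a single node survives a run of t days with probability exp(-t / MTTF), and the series system survives only if every node does.

```java
// Sketch: failure probability of a run on n nodes in a series reliability model.
// Assumption (not stated in the text): node lifetimes are independent and
// exponentially distributed.
final class SeriesReliability {

    /** Probability that one node survives the whole run. */
    static double nodeReliability(double runDays, double nodeMttfDays) {
        return Math.exp(-runDays / nodeMttfDays);
    }

    /** Series system: the run succeeds only if every node survives. */
    static double failureProbability(int nodes, double runDays, double nodeMttfDays) {
        // Product of n identical node reliabilities.
        double systemReliability = Math.pow(nodeReliability(runDays, nodeMttfDays), nodes);
        return 1.0 - systemReliability;
    }

    public static void main(String[] args) {
        // Values from the example: a 10-day run, node MTTF of 2,000 days.
        for (int nodes : new int[] {100, 200, 500}) {
            System.out.printf("%d nodes: %.3f%n", nodes,
                    failureProbability(nodes, 10.0, 2000.0));
        }
    }
}
```

With these inputs the model yields roughly 0.39, 0.63, and 0.92. The last value is about 0.918, so it agrees with the quoted 0.91 only if that figure was truncated rather than rounded.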
2.2 EXISTING SYSTEM
Grid or cluster architectures consisting of hundreds or thousands of computational nodes create reliability problems.
The problems are node failures and the need for dynamic configuration over extensive runtime.
For applications executing on large systems, the MTTF (Mean Time to Failure) approaches or sinks below the execution time of the application, and the existing system cannot solve the resulting fault-tolerance problems in grid computing.
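To illustrate how the system MTTF can sink below the execution time, the sketch below assumes independent nodes with exponentially distributed lifetimes, in which case the MTTF of a series system of n identical nodes is the node MTTF divided by n. This model and all names in the code are assumptions made for illustration; the figures of 2,000 days and 10 days are taken from the introduction.

```java
// Sketch: MTTF of a series system of identical nodes versus the run length.
// Assumption: independent nodes with exponentially distributed lifetimes,
// so the failure rates add up and the system MTTF is nodeMttf / n.
final class SystemMttf {

    static double systemMttfDays(int nodes, double nodeMttfDays) {
        return nodeMttfDays / nodes;
    }

    /** True when the run is at least as long as the system MTTF. */
    static boolean runOutlastsSystem(int nodes, double nodeMttfDays, double runDays) {
        return runDays >= systemMttfDays(nodes, nodeMttfDays);
    }

    public static void main(String[] args) {
        for (int nodes : new int[] {100, 200, 500}) {
            System.out.println(nodes + " nodes: system MTTF = "
                    + systemMttfDays(nodes, 2000.0) + " days, run outlasts it: "
                    + runOutlastsSystem(nodes, 2000.0, 10.0));
        }
    }
}
```

Under these assumptions, a 10-day run reaches the system MTTF at 200 nodes and exceeds it for any larger configuration.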
2.3 PROPOSED SYSTEM
The proposed protocols, Theft-Induced Checkpointing and Systematic Event Logging, base the state of the execution on a dataflow graph, allowing for efficient recovery in dynamic heterogeneous systems as well as multithreaded applications.
Because they allow recovery even under a different number of processors, the approaches are especially suitable for applications that need dynamic configuration at runtime. The low-cost protocols offer the capability of controlling or bounding the overhead: the overhead of the protocol is very small, and the maximum work lost by a crashed process is small and bounded.
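The idea of keeping the execution state in a dataflow graph can be sketched as follows. This is a deliberately simplified, single-threaded illustration and not the actual Theft-Induced Checkpointing or Systematic Event Logging protocol; the Task type, the checkpoint format, and the simulated crash are all hypothetical. It only shows why recovery does not depend on which processors ran the tasks originally: the checkpoint holds task results, and recovery re-executes just the tasks missing from it.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, not the paper's implementation: the execution state is
// the set of task results in a dataflow graph.
final class DataflowRecovery {

    /** A task adds its own value to the sum of its inputs' results. */
    static final class Task {
        final String id;
        final int value;
        final List<String> inputs;
        Task(String id, int value, List<String> inputs) {
            this.id = id;
            this.value = value;
            this.inputs = inputs;
        }
    }

    /**
     * Runs the tasks (listed in dependency order) that are not yet in done.
     * Stops after budget executions to simulate a crash; returns how many ran.
     */
    static int run(List<Task> graph, Map<String, Integer> done, int budget) {
        int executed = 0;
        for (Task t : graph) {
            if (done.containsKey(t.id)) continue; // result already available
            if (executed == budget) break;        // simulated crash
            int sum = t.value;
            for (String in : t.inputs) sum += done.get(in);
            done.put(t.id, sum);
            executed++;
        }
        return executed;
    }

    /** A checkpoint is an independent copy of the completed results. */
    static Map<String, Integer> checkpoint(Map<String, Integer> done) {
        return new LinkedHashMap<>(done);
    }

    static List<Task> exampleGraph() {
        List<Task> g = new ArrayList<>();
        g.add(new Task("a", 1, List.of()));
        g.add(new Task("b", 2, List.of("a")));
        g.add(new Task("c", 3, List.of("a")));
        g.add(new Task("d", 4, List.of("b", "c")));
        return g;
    }

    public static void main(String[] args) {
        List<Task> graph = exampleGraph();
        Map<String, Integer> state = new LinkedHashMap<>();
        run(graph, state, 2);                           // a and b complete
        Map<String, Integer> saved = checkpoint(state); // checkpoint holds a and b
        run(graph, state, 1);                           // c completes, then the crash
        // Recovery: discard the crashed state and resume from the checkpoint.
        Map<String, Integer> recovered = checkpoint(saved);
        int rerun = run(graph, recovered, Integer.MAX_VALUE);
        System.out.println("re-executed " + rerun + " tasks, d = " + recovered.get("d"));
    }
}
```

In this sketch the work lost to the crash is limited to the tasks executed since the last checkpoint (here only task c), which is the sense in which more frequent checkpoints bound the lost work.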
3.1 HARDWARE CONFIGURATION
• PROCESSOR : Pentium IV 2.4 GHz
• HARD DISK : 40 GB
• RAM : 512 MB
3.2 SOFTWARE CONFIGURATION
• Operating system : Windows XP Professional
• Front End : Java
• Back End : MySQL
1.1 PROJECT DESCRIPTION
The project, entitled “Flexible Rollback Recovery in Dynamic Heterogeneous Grid Computing”, is developed in Java. Its modules are as follows.
• Login/Registration
• Node Analysis
• Node Selection
• Transmitting data
MODULES DESCRIPTION:
1) Login / Registration:
This module authorizes a user to access the other modules of the project. A user gains access after registration.
2) Node Analysis
The nodes of the cluster are analyzed to determine which node is the most efficient.
3) Node Selection
This module selects a node from the cluster.
4) Transmitting Data:
The data are transmitted through the selected node.
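The node analysis, node selection, and data transmission modules could fit together roughly as sketched below. The description does not define what makes a node "efficient", so this example assumes it means the reachable node with the lowest load; the Node type, its fields, and the method names are hypothetical and are not taken from the project.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

// Hypothetical sketch: "most efficient" is assumed to mean the reachable
// node with the lowest load.
final class NodeSelector {

    static final class Node {
        final String name;
        final double load;       // assumed metric: fraction of capacity in use
        final boolean reachable; // false for a crashed or disconnected node
        Node(String name, double load, boolean reachable) {
            this.name = name;
            this.load = load;
            this.reachable = reachable;
        }
    }

    /** Node analysis and selection: lowest load among reachable nodes. */
    static Optional<Node> select(List<Node> cluster) {
        return cluster.stream()
                .filter(n -> n.reachable)
                .min(Comparator.comparingDouble(n -> n.load));
    }

    /** Transmitting data: route the payload through the selected node. */
    static String transmit(List<Node> cluster, String payload) {
        return select(cluster)
                .map(n -> payload + " sent via " + n.name)
                .orElse("no reachable node");
    }

    public static void main(String[] args) {
        List<Node> cluster = List.of(
                new Node("n1", 0.7, true),
                new Node("n2", 0.1, false), // least loaded, but unreachable
                new Node("n3", 0.3, true));
        System.out.println(transmit(cluster, "data"));
    }
}
```

Filtering out unreachable nodes before ranking keeps a failed node from being selected even when its last reported load was the lowest.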