
RAIN Technology



ABSTRACT

The RAIN project is a research collaboration between Caltech and NASA-JPL on
distributed computing and data storage systems for future space-borne missions. The goal of the
project is to identify and develop key building blocks for reliable distributed systems built with
inexpensive off-the-shelf components.
The RAIN platform consists of a heterogeneous cluster of computing and/or storage nodes
connected via multiple interfaces to networks configured in fault-tolerant topologies. The RAIN
software components run in conjunction with operating system services and standard network
protocols. Through software-implemented fault tolerance, the system tolerates multiple node,
link, and switch failures, with no single point of failure.
The RAIN technology has been transferred to Rainfinity, a start-up company focusing on
creating clustered solutions for improving the performance and availability of Internet data
centers.


INTRODUCTION
RAIN technology originated in a research project at the California Institute of
Technology (Caltech), in collaboration with NASA's Jet Propulsion Laboratory and the Defense
Advanced Research Projects Agency (DARPA). The name of the original research project was
RAIN, which stands for Reliable Array of Independent Nodes. The main purpose of the RAIN
project was to identify key software building blocks for creating reliable distributed applications
using off-the-shelf hardware. The focus of the research was on high-performance, fault-tolerant
and portable clustering technology for space-borne computing. Led by Caltech professor Shuki
Bruck, the RAIN research team in 1998 formed a company called Rainfinity. Rainfinity, located
in Mountain View, Calif., is already shipping its first commercial software package derived from
the RAIN technology, and company officials plan to release several other Internet-oriented
applications. The RAIN project was started four years ago at Caltech to create an alternative to
the expensive, special-purpose computer systems used in space missions. The Caltech
researchers wanted to put together a highly reliable and available computer system by
distributing processing across many low-cost commercial hardware and software components.
To tie these components together, the researchers created RAIN software, which has three
components:
1. A component that stores data across distributed processors and retrieves it even if some
of the processors fail.
2. A communications component that creates a redundant network between multiple
processors and supports a single, uniform way of connecting to any of the processors.
3. A computing component that automatically recovers and restarts applications if a
processor fails.
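
As an illustration of the first component, the sketch below stores data across several nodes so
that it can still be retrieved after one node is lost. It is a minimal sketch, not the RAIN
implementation: the text does not say which coding scheme RAIN uses, so a single XOR parity
block is assumed here, and the names encode, decode and xor_bytes are hypothetical.

```python
from functools import reduce
from typing import List, Optional


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(data: bytes, k: int) -> List[bytes]:
    """Split data into k equal-size blocks and append one XOR parity block.

    Each of the k + 1 returned blocks would be stored on a different node.
    """
    block_len = -(-len(data) // k)  # ceiling division
    padded = data.ljust(block_len * k, b"\0")
    blocks = [padded[i * block_len:(i + 1) * block_len] for i in range(k)]
    parity = reduce(xor_bytes, blocks)
    return blocks + [parity]


def decode(blocks: List[Optional[bytes]], length: int) -> bytes:
    """Rebuild the data when at most one of the k + 1 blocks is missing (None)."""
    missing = [i for i, b in enumerate(blocks) if b is None]
    if len(missing) > 1:
        raise ValueError("single parity tolerates only one lost block")
    blocks = list(blocks)
    if missing:
        # XOR of all surviving blocks equals the lost block (assumed scheme).
        survivors = [b for b in blocks if b is not None]
        blocks[missing[0]] = reduce(xor_bytes, survivors)
    # Drop the parity block and the zero padding added by encode().
    return b"".join(blocks[:-1])[:length]
```

With four data blocks and one parity block, any single node can fail and decode still returns
the original bytes. Tolerating several simultaneous failures would need a stronger erasure code
than the single parity assumed here.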


Myrinet switches provide the high-speed cluster network used for message passing between
compute nodes and for I/O. The switches expose a few counters that can be read over an
Ethernet connection to the switch and used to monitor the health of the connections, cables,
and so on. The following information refers to the 16-port switches, the Clos-64 switches,
and the Myrinet2000 switches.
ServerNet is a switched fabric communications link primarily used in proprietary computers
made by Tandem Computers, Compaq, and HP. Its features include good scalability, clean fault
containment, error detection and failover.
The ServerNet architecture specification defines a connection between nodes, either processor
nodes or high-performance I/O nodes such as storage devices. Tandem Computers developed the original
ServerNet architecture and protocols for use in its own proprietary computer systems starting in
1992, and released the first ServerNet systems in 1995.
Early attempts to license the technology and interface chips to other companies failed, due in part
to a disconnect between the culture of selling complete hardware / software / middleware
computer systems and that needed for selling and supporting chips and licensing technology.

A follow-on development effort ported the Virtual Interface Architecture to ServerNet with PCI
interface boards connecting personal computers. InfiniBand directly inherited many ServerNet
features. After 25 years, systems still ship today based on the ServerNet architecture.


ORIGIN
1. RAIN technology was developed by the California Institute of Technology, in collaboration
with NASA’s Jet Propulsion Laboratory and DARPA.
2. The name of the original research project was RAIN, which stands for Reliable Array of
Independent Nodes.
3. The RAIN research team formed a company called Rainfinity in 1998.


ARCHITECTURE
The RAIN technology incorporates a number of unique innovations as its core modules:
Reliable transport ensures reliable communication between the nodes in the cluster. This
transport has a built-in acknowledgement scheme that ensures reliable packet delivery. It
transparently uses all available network links to reach the destination. When it fails to do so,
it alerts the upper layer, thereby functioning as a failure detector. This module is portable to
different computer platforms, operating systems and networking environments.
The consistent global state sharing protocol provides consistent group membership, optimized
information distribution and distributed group decision making for a RAIN cluster. This module
is at the core of a RAIN cluster. It enables efficient group communication among the computing
nodes and ensures that they operate together without conflict.
Always on IP maintains pools of "always-available" virtual IPs.
These virtual IPs are logical addresses that can move from one node to another for
load sharing or fail-over. Usually a pool of virtual IPs is created for each subnet that the RAIN
cluster is connected to. A pool can consist of one or more virtual IPs.
Always on IP guarantees that all virtual IP addresses representing the cluster are available as
long as at least one node in the cluster is operational. In other words, when a physical node in
the cluster fails, its virtual IPs are taken over by another healthy node in the cluster.
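
The takeover behaviour of a virtual IP pool can be sketched as a small in-memory model. This is
only an illustration under stated assumptions: the class VirtualIPPool, its methods, the
least-loaded assignment rule and the example addresses are all hypothetical, and a real cluster
would also have to rebind the addresses on the network.

```python
from typing import Dict, List, Set


class VirtualIPPool:
    """Toy model of a pool of virtual IPs shared by the nodes of a cluster."""

    def __init__(self, vips: List[str], nodes: List[str]) -> None:
        self.healthy: Set[str] = set(nodes)
        self.owner: Dict[str, str] = {}  # virtual IP -> node currently serving it
        self._assign(vips)

    def _assign(self, vips: List[str]) -> None:
        """Give each VIP to the healthy node that currently owns the fewest."""
        for vip in vips:
            if not self.healthy:
                raise RuntimeError("no healthy node left to host " + vip)
            load = {node: 0 for node in self.healthy}
            for node in self.owner.values():
                load[node] += 1
            # Sorting first makes tie-breaking deterministic (by node name).
            self.owner[vip] = min(sorted(load), key=load.get)

    def node_failed(self, node: str) -> None:
        """Move every VIP owned by the failed node to a surviving node."""
        self.healthy.discard(node)
        orphaned = sorted(v for v, n in self.owner.items() if n == node)
        for vip in orphaned:
            del self.owner[vip]
        self._assign(orphaned)
```

For example, a pool of three addresses on nodes "a", "b" and "c" stays fully assigned after
"b" and then "a" fail, with "c" serving all three addresses at the end. Only when the last node
fails does the model raise an error, which mirrors the guarantee that the addresses remain
available as long as at least one node is operational.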
Local and global fault monitors monitor, on a continuous or event-driven basis, the critical
resources within and around the cluster: network connections, Rainfinity or other applications
residing on the nodes, and remote nodes or applications.
They are an integral part of the RAIN technology, guaranteeing the healthy operation of the cluster.
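
The reliable transport module described at the start of this section combines acknowledgements,
use of every available link, and failure detection. The sketch below shows one way these three
ideas fit together. It is an assumption-laden illustration, not the RAIN protocol: a link is
modelled as a plain function that returns True when an acknowledgement comes back, and the
names Link, PeerUnreachable and reliable_send are hypothetical.

```python
from typing import Callable, List

# Hypothetical link interface: try to deliver a packet and report whether
# an acknowledgement was received from the destination.
Link = Callable[[bytes], bool]


class PeerUnreachable(Exception):
    """Signals to the upper layer that no link produced an acknowledgement."""


def reliable_send(packet: bytes, links: List[Link], retries: int = 2) -> int:
    """Try every link, over several rounds, until one send is acknowledged.

    Returns the index of the link that succeeded. If all rounds fail on all
    links, the destination is reported as unreachable, so the function also
    acts as a simple failure detector.
    """
    for _ in range(retries + 1):
        for index, link in enumerate(links):
            if link(packet):
                return index
    raise PeerUnreachable("no acknowledgement on any of %d links" % len(links))
```

A caller with one broken and one working link gets the index of the working link back, while a
caller whose links are all broken receives PeerUnreachable and can treat the remote node as
failed.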