29-10-2012, 02:05 PM
BEOWULF CLUSTER
INTRODUCTION
Clusters are essentially a group of computers connected over a network that work in tandem to appear as a single computer to an outside user. They are also called a Network of Workstations (NOW) or a Pile of PCs (PoPC). This parallel processing architecture emerged when the earlier design, a single high-performance computer, failed to meet growing requirements. The best supercomputers available today use Symmetric Multi-Processing (SMP), i.e., multiple processors within one box; they are expensive and run custom-made software. In contrast, clusters run common workstation software on every workstation and distribute the workload across them to gain higher performance. Clusters can be designed for various purposes: general high-performance computing (the Beowulf class), high-performance web serving (MOSIX), virtual parallel file systems (Grendal), virtual shared-memory supercomputers (SHRIMP and Locust), and so on. Because clusters use commodity hardware, and can even reuse old hardware, they are a much cheaper option than supercomputers. It is this good price/performance ratio that has made clusters popular.
What is Clustering?
Clustering is most widely recognized as the ability to combine multiple systems so that they provide services a single system could not. Clustering is used to achieve higher availability, greater scalability, and easier management. Higher availability is achieved with "failover" clusters, in which resources automatically move between two or more nodes in the event of a failure. Scalability is achieved by balancing the load of an application across several computer systems. Simpler management is attained through the use of virtual servers, as opposed to managing each individual computer system.
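The load-balancing idea above can be sketched in a few lines. This is a minimal illustration, not part of any real cluster software: the node names and the `dispatch` function are hypothetical, and a production balancer would also track node health rather than blindly rotating.

```python
# A minimal round-robin load-balancing sketch across cluster nodes.
# Node names and dispatch() are hypothetical, for illustration only.
from itertools import cycle

nodes = ["node1", "node2", "node3"]
next_node = cycle(nodes)          # rotates endlessly over the node list

def dispatch(request):
    """Assign the request to the next node in round-robin order."""
    node = next(next_node)
    return f"{request} -> {node}"

# Three requests are spread evenly over the three nodes.
assignments = [dispatch(f"req{i}") for i in range(3)]
```

A real balancer would remove a node from the rotation when it stops answering, which is where the failover mechanism described above comes in.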
What are High Availability Clusters?
High-availability clustering joins two or more servers to help guard against system failures, including planned shutdowns (e.g., maintenance, backups) and unplanned outages (e.g., hardware failure, software failure, operator error). The group of connected systems is known as a cluster.
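As a toy illustration of the failover decision described above: a standby node takes over when the primary misses enough heartbeats. The node names and the threshold are assumptions of this sketch; real HA products use heartbeat daemons, quorum, and fencing, all omitted here.

```python
# Toy failover rule: promote the standby once the primary has missed
# MISSED_HEARTBEAT_LIMIT consecutive heartbeats. Names are illustrative.
MISSED_HEARTBEAT_LIMIT = 3

def choose_active(primary_missed, standby_name, primary_name="primary"):
    """Return the name of the node that should hold the service."""
    if primary_missed >= MISSED_HEARTBEAT_LIMIT:
        return standby_name   # failover: the resource moves to the standby
    return primary_name       # primary is healthy enough to keep serving

active = choose_active(primary_missed=0, standby_name="standby")
failed_over = choose_active(primary_missed=3, standby_name="standby")
```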
BEOWULF HISTORY:
Beowulf was a cluster developed in 1994 by Thomas Sterling at the Center of Excellence in Space Data and Information Sciences (CESDIS) at NASA Goddard Space Flight Center [1]. That cluster employed 16 Intel 100 MHz DX4 PCs, each with 16 MB of RAM and two Ethernet cards. It peaked at 60 MFlops against an aggregate peak of 72 MFlops (16 machines at 4.5 MFlops each), a degradation of only about 17%, which is a very good result.
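The quoted performance figures can be checked with a quick calculation. "Degradation" here simply means the shortfall of the cluster's sustained throughput versus the summed peak of all 16 PCs:

```python
# Checking the original Beowulf numbers: 16 PCs at 4.5 MFlops each
# give an aggregate peak of 72 MFlops; the cluster sustained 60 MFlops.
nodes = 16
per_node_mflops = 4.5
aggregate_peak = nodes * per_node_mflops      # 72.0 MFlops
sustained = 60.0
degradation = 1 - sustained / aggregate_peak  # 1/6, i.e. about 17%
```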
The idea was to use commodity off-the-shelf (COTS) components to build a cluster system addressing a particular computational requirement of the ESS community. The Beowulf project was an instant success, demonstrating that a commodity cluster is an attractive alternative for high-performance computing (HPC). Researchers within the HPC community now refer to such systems as High-Performance Computing Clusters (HPCC). Nowadays, Beowulf systems (or HPCCs) are widely used for solving problems in many application domains, ranging from high-end, floating-point-intensive scientific and engineering problems to commercial data-intensive tasks. Examples include seismic analysis for oil exploration, aerodynamic simulation for motor and aircraft design, molecular modeling for biomedical research, and data mining and financial modeling for business analysis.
DESIGN BASICS
A typical Beowulf system may comprise 16 nodes interconnected by 100BASE-T Fast Ethernet. Each node may include a single Intel Pentium II or Digital Alpha 21164PC processor, 128-512 MB of DRAM, 3-30 GB of EIDE disk, a PCI bus backplane, and an assortment of other devices. At least one node, called the master node, has a video card, monitor, keyboard, CD-ROM drive, floppy drive, and so forth. All the nodes in the cluster are commodity systems - PCs, workstations, or servers - running commodity software such as Linux; data transfer between nodes commonly uses UDP-based protocols. The master node acts as the Network File System (NFS) server and as a gateway to the outside world. To keep the master node highly available to users, High Availability (HA) clustering might be employed, as shown in the figure.
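The master/worker division of labor described above can be imitated on a single machine. This sketch uses Python threads as stand-ins for cluster nodes, purely to illustrate how a master splits a job into chunks and combines the partial results; on a real Beowulf the workers would be separate PCs reached over Ethernet (typically via a message-passing library such as MPI), not threads.

```python
# Single-machine stand-in for the Beowulf master/worker pattern:
# the "master" splits a sum over range(n) into chunks, the "nodes"
# (threads here) each compute a partial sum, and the master combines them.
from concurrent.futures import ThreadPoolExecutor

def partial_sum(bounds):
    lo, hi = bounds
    return sum(range(lo, hi))

n = 1_000_000
step = n // 4
chunks = [(i, i + step) for i in range(0, n, step)]  # one chunk per "node"

with ThreadPoolExecutor(max_workers=4) as pool:      # 4 simulated nodes
    total = sum(pool.map(partial_sum, chunks))       # gather and reduce
```

The same split/compute/gather shape is what MPI programs on a Beowulf express, with network messages in place of the thread pool.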