23-03-2011, 03:26 PM
Presented by:
Sunitha M. Jenarius
BlueGene.ppt (Size: 971 KB / Downloads: 114)
BLUE GENE
What is Blue Gene
A massively parallel supercomputer using tens of thousands of embedded PowerPC processors supporting a large memory space
With standard compilers and message passing environment
Why the name “Blue Gene”?
“Blue”: The corporate color of IBM
“Gene”: The intended use of the Blue Gene clusters – Computational biology, specifically, protein folding
History
In December 1999, IBM Research announced a US$100 million effort to build a petaflop-scale supercomputer.
Two goals of the Blue Gene project:
– Massively parallel machine architecture and software
– Biomolecular simulation: advance the state of the art by orders of magnitude
November 2001, Partnership with Lawrence Livermore National Laboratory (LLNL)
and this resulted in …
Results
Linpack Top 500 Supercomputers
Blue Gene Projects
Four Blue Gene projects:
– BlueGene/L
– BlueGene/C
– BlueGene/P
– BlueGene/Q
Blue Gene/L
The first computer in the Blue Gene series
On Sept. 29, 2004, IBM announced that a Blue Gene/L prototype had overtaken NEC's Earth Simulator as the fastest computer in the world
Final configuration was launched in October 2005
Blue Gene/L: Unsurpassed Performance
Designed to deliver the most performance per kilowatt of power consumed
Theoretical peak performance of 360 TFLOPS
Final configuration (Oct. 2005) sustained over 280 TFLOPS on the Linpack benchmark.
On Nov. 14, 2006, at Supercomputing 2006, Blue Gene/L was awarded the winning prize in all HPC Challenge classes of awards.
Blue Gene/L Architecture
Can be scaled up to 65,536 compute or I/O nodes, with 131,072 processors
Each node is a single ASIC with associated DRAM memory chips
Each ASIC has two 700 MHz IBM PowerPC processors
PowerPC processors
– Low-frequency, low-power embedded processors, superior in performance per watt to today's high-frequency, high-power microprocessors by a factor of 2 or more
– Double-pipeline, double-precision floating-point unit (DFPU)
– A cache sub-system with built-in DRAM controller
Node CPUs are not cache coherent with one another
FPUs and CPUs are designed for low power consumption
– Using transistors with low leakage current
– Local clock gating
– Putting the FPU or CPU/FPU pair to sleep
One rack holds 1,024 nodes (2,048 processors)
Nodes optimized for low power consumption
ASIC based on System-on-a-chip technology
– Using large numbers of low-power system-on-a-chip nodes allows it to outperform commodity clusters while saving on power
– Aggressive packaging of processors, memory and interconnect
– Power Efficient & Space Efficient
– Allows for latencies and bandwidths that are significantly better than those for nodes typically used in ASC scale supercomputers
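The node, rack and clock figures above can be combined into a rough sizing check. The following is only a back-of-the-envelope sketch: the value of 4 floating-point operations per cycle per processor is an assumption (two FPU pipelines, each completing one fused multiply-add), and the function name `system_summary` is made up for illustration.

```python
# Back-of-the-envelope sizing for a full Blue Gene/L system,
# using the figures quoted in the slides.
NODES = 65_536              # maximum node count
PROCESSORS_PER_NODE = 2     # two PowerPC processors per ASIC
NODES_PER_RACK = 1_024      # one rack holds 1,024 nodes
CLOCK_HZ = 700e6            # 700 MHz
FLOPS_PER_CYCLE = 4         # ASSUMPTION: 2 FPU pipelines x 1 fused multiply-add each


def system_summary(nodes=NODES):
    """Derive processor count, rack count and peak TFLOPS for `nodes` nodes."""
    processors = nodes * PROCESSORS_PER_NODE
    racks = nodes // NODES_PER_RACK
    peak_tflops = processors * CLOCK_HZ * FLOPS_PER_CYCLE / 1e12
    return {"processors": processors, "racks": racks, "peak_tflops": peak_tflops}


if __name__ == "__main__":
    summary = system_summary()
    print(summary)
    # Ratio of the quoted 280 TFLOPS Linpack result to the derived peak
    print(f"Linpack efficiency: {280 / summary['peak_tflops']:.0%}")
```

Under that assumption the derived peak lands close to the 360 TFLOPS quoted earlier, and the 280 TFLOPS Linpack result corresponds to roughly three quarters of peak.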
Blue Gene/L Networks
Each node is attached to 3 main parallel communication networks
– 3D torus network – peer-to-peer between compute nodes
– Collective network – collective & global communication
– Ethernet network – I/O and management (such as access to any node for configuration, booting and diagnostics)
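To make the 3D torus concrete: each node has six nearest neighbours (one in each direction along x, y and z), and the links wrap around at the edges. The sketch below illustrates that addressing only; the 8 x 8 x 8 dimensions and the function names are hypothetical, and it does not model the actual Blue Gene/L routing hardware.

```python
def torus_neighbors(coord, dims):
    """Return the six nearest neighbours of `coord` in a 3D torus of size `dims`."""
    x, y, z = coord
    dx, dy, dz = dims
    return [
        ((x + 1) % dx, y, z), ((x - 1) % dx, y, z),   # +x / -x, wrapping at the edge
        (x, (y + 1) % dy, z), (x, (y - 1) % dy, z),   # +y / -y
        (x, y, (z + 1) % dz), (x, y, (z - 1) % dz),   # +z / -z
    ]


def torus_hops(a, b, dims):
    """Minimum hop count from a to b, taking the shorter way round each ring."""
    return sum(min(abs(p - q), d - abs(p - q)) for p, q, d in zip(a, b, dims))


if __name__ == "__main__":
    dims = (8, 8, 8)  # hypothetical partition shape, for illustration only
    print(torus_neighbors((0, 0, 0), dims))
    print(torus_hops((0, 0, 0), (7, 0, 0), dims))
```

Because of the wraparound, the corner node (0, 0, 0) is a direct neighbour of (7, 0, 0), and the worst-case distance along any axis is half the ring length rather than the full length.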
Blue Gene/L System Software
System software supports efficient execution of parallel applications
Compiler support for DFPU (C, C++, Fortran)
Compute nodes use a minimal operating system called “BlueGene/L compute node kernel”
– A lightweight, single-user operating system
– Supports execution of a single dual-threaded application compute process
– Kernel provides a single and static virtual address space to one running compute process
– Because of single-process nature, no context switching required
Blue Gene/L System Software (contd.)
To allow multiple programs to run concurrently
– Blue Gene/L system can be partitioned into electronically isolated sets of nodes
– The number of nodes in a partition must be a positive integer power of 2
– To run a program, a partition must first be reserved
– No other program can use the partition until the current program has finished
– With so many nodes, component failures are inevitable. The system is able to electrically isolate faulty hardware to allow the machine to continue to run
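The partition rules above (power-of-2 size, exclusive reservation) can be sketched as follows. This is a toy model, not the Blue Gene/L control system: the `Partition` class and its methods are hypothetical, and "positive integer power of 2" is read here as 2, 4, 8, and so on.

```python
def is_valid_partition_size(n):
    """True if n is 2**k for some integer k >= 1 (one reading of the rule above)."""
    return isinstance(n, int) and n >= 2 and (n & (n - 1)) == 0


class Partition:
    """Hypothetical model of an isolated set of nodes that runs one program at a time."""

    def __init__(self, nodes):
        if not is_valid_partition_size(nodes):
            raise ValueError(f"{nodes} is not a positive integer power of 2")
        self.nodes = nodes
        self.owner = None   # name of the program currently holding the partition

    def reserve(self, program):
        """Reserve the partition; fails if another program still holds it."""
        if self.owner is not None:
            raise RuntimeError(f"partition is in use by {self.owner}")
        self.owner = program

    def release(self):
        """Free the partition once the current program has finished."""
        self.owner = None


if __name__ == "__main__":
    p = Partition(512)
    p.reserve("protein_folding")
    try:
        p.reserve("other_job")
    except RuntimeError as err:
        print("rejected:", err)
    p.release()
    p.reserve("other_job")
```

The bit trick `n & (n - 1) == 0` works because a power of 2 has exactly one bit set, so subtracting 1 clears that bit and sets all lower ones.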
Parallel Programming model
– Message Passing – supported through an implementation of MPI
– Only a subset of POSIX calls are supported
– Green threads are also used to simulate local concurrency
Blue Gene/C
Sister-project to BlueGene/L
Renamed to Cyclops64
Massively parallel, supercomputer-on-a-chip cellular architecture
Cellular architecture gives the programmer the ability to run large numbers of concurrent threads within a single processor.
Blue Gene/P
Architecturally similar to BlueGene/L
Expected to operate around one petaflop
Expected around 2008
Blue Gene/Q
Last known supercomputer in the Blue Gene series
Expected to reach 3-10 petaflops