Cluster Computing: You've Come A Long Way In A Short Time



1

Cluster Computing: You've Come A Long Way In A Short Time

Jack Dongarra University of Tennessee and Oak Ridge National Laboratory

LCSC 5th Annual Workshop on Linux Clusters for Super Computing October 18-21, 2004 Linköping University, Sweden

2

Vibrant Field for High Performance Computers

♦ Cray X1 ♦ SGI Altix ♦ IBM Regatta ♦ IBM Blue Gene/L ♦ IBM eServer ♦ Sun ♦ HP ♦ Bull NovaScale ♦ Fujitsu PrimePower ♦ Hitachi SR11000 ♦ NEC SX-7 ♦ Apple

♦ Coming soon …

Cray RedStorm Cray BlackWidow NEC SX-8


3

  • H. Meuer, H. Simon, E. Strohmaier, & JD
  • Listing of the 500 most powerful computers in the world
  • Yardstick: Rmax from LINPACK MPP (Ax=b, dense problem)
  • Updated twice a year:
    SC'xy in the States in November
    Meeting in Heidelberg, Germany in June
  • All data available from www.top500.org

[Chart: TPP performance (rate vs. size).]
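The Rmax yardstick is the rate achieved solving a dense Ax=b; as a rough illustration (not the benchmark code itself), the rate follows from the standard LU flop count of 2/3·n³ + 2·n² operations divided by the wall-clock time. A minimal sketch with a made-up problem size and run time:

```python
# Rough illustration of how a Linpack rate is derived from problem size
# and run time; the real benchmark (HPL) does the solve and timing itself.
def linpack_gflops(n, seconds):
    """Gflop/s for a dense n x n Ax=b solve (2/3*n^3 + 2*n^2 flops)."""
    flops = (2.0 / 3.0) * n**3 + 2.0 * n**2
    return flops / seconds / 1e9

# Hypothetical run: n = 100,000 finishing in 2,000 seconds
print(f"{linpack_gflops(100_000, 2_000):.1f} Gflop/s")
```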

4

Architecture/Systems Continuum

Custom processor with custom interconnect

  • Cray X1
  • NEC SX-7
  • IBM Regatta
  • IBM Blue Gene/L

Commodity processor with custom interconnect

  • SGI Altix (Intel Itanium 2)
  • Cray Red Storm (AMD Opteron)

Commodity processor with commodity interconnect

  • Clusters (Pentium, Itanium, Opteron, Alpha; GigE, Infiniband, Myrinet, Quadrics)
  • NEC TX7
  • IBM eServer
  • Bull NovaScale 5160

[Chart: share of Top500 systems by architecture class (Custom, Hybrid, Commodity), June 1993 through June 2004, spanning the loosely coupled to tightly coupled continuum.]


5

It is really difficult to tell when an exponential is happening… by the time you get enough data points, it is too late.

Larry Smarr

6

Top500 Performance by Manufacturer June 2004

IBM 51% HP 19% SGI 3% Sun 1% Fujitsu 2% Hitachi 1% Self-made 2% Dell 3% NEC 6% Cray Inc. 2% California Digital Corp. 2% Intel 0% Linux Networx 3% Others 5%


7

The Golden Age of HPC Linux

♦ The adoption rate of Linux HPC is phenomenal!

Linux in the Top500 is (was) doubling every 12 months
Linux adoption is not driven by bottom feeders

Adoption is actually faster at the ultra-scale!

♦ Most supercomputers run Linux
♦ Adoption rate driven by several factors:

Linux is stable: often the default platform for CS research
Essentially no barrier to entry
Effort to learn the programming paradigm, libraries, development environment, and tools is preserved across many orders of magnitude
Stable, complete, portable middleware software stacks: MPICH, MPI-IO, PVFS, PBS, math libraries, etc.

8

Commodity Processors

♦ Intel Pentium Xeon

3.2 GHz, peak = 6.4 Gflop/s; Linpack 100 = 1.7 Gflop/s; Linpack 1000 = 3.1 Gflop/s

♦ AMD Opteron

2.2 GHz, peak = 4.4 Gflop/s; Linpack 100 = 1.3 Gflop/s; Linpack 1000 = 3.1 Gflop/s

♦ Intel Itanium 2

1.5 GHz, peak = 6 Gflop/s; Linpack 100 = 1.7 Gflop/s; Linpack 1000 = 5.4 Gflop/s

♦ HP PA RISC ♦ Sun UltraSPARC IV ♦ HP Alpha EV68

1.25 GHz, 2.5 Gflop/s peak

♦ MIPS R16000
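The peak numbers quoted above are simply clock rate times floating-point operations per cycle; a small sketch (the flops-per-cycle values here are assumptions chosen to be consistent with the peaks listed above):

```python
# Peak rate = clock frequency x floating-point operations per cycle.
# Flops/cycle figures below are illustrative assumptions that reproduce
# the peak numbers quoted on the slide.
processors = {
    "Intel Pentium Xeon": (3.2, 2),   # GHz, flops/cycle
    "AMD Opteron":        (2.2, 2),
    "Intel Itanium 2":    (1.5, 4),
    "HP Alpha EV68":      (1.25, 2),
}
for name, (ghz, fpc) in processors.items():
    print(f"{name}: peak = {ghz * fpc:.1f} Gflop/s")
```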


9

♦ Gig Ethernet ♦ Myrinet ♦ Infiniband ♦ QsNet ♦ SCI

Commodity Interconnects

Cost (NIC / switch per node / total per node) and MPI latency / 1-way bandwidth / bidirectional bandwidth:

  • Gigabit Ethernet (bus): $50 / $50 / $100; 30 µs / 100 MB/s / 150 MB/s
  • SCI (torus): $1,600 / $0 / $1,600; 5 µs / 300 MB/s / 400 MB/s
  • QsNetII (R) (fat tree): $1,200 / $1,700 / $2,900; 3 µs / 880 MB/s / 900 MB/s
  • QsNetII (E) (fat tree): $1,000 / $700 / $1,700; 3 µs / 880 MB/s / 900 MB/s
  • Myrinet (D card) (Clos): $595 / $400 / $995; 6.5 µs / 240 MB/s / 480 MB/s
  • Myrinet (E card) (Clos): $995 / $400 / $1,395; 6 µs / 450 MB/s / 900 MB/s
  • IB 4x (fat tree): $1,000 / $400 / $1,400; 6 µs / 820 MB/s / 790 MB/s
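Reading the table above: the per-node interconnect cost is the NIC plus the per-node share of the switch, so the fabric budget scales linearly with node count. A small sketch using the table's cost figures and a hypothetical 256-node cluster:

```python
# Interconnect cost for an N-node cluster, using the per-node NIC and
# switch-port costs from the table above.
interconnects = {
    "Gigabit Ethernet": (50, 50),      # (NIC $, switch $ per node)
    "SCI":              (1600, 0),
    "QsNetII (E)":      (1000, 700),
    "Myrinet (D card)": (595, 400),
    "Infiniband 4x":    (1000, 400),
}
nodes = 256  # hypothetical cluster size
for name, (nic, sw_per_node) in interconnects.items():
    total = nodes * (nic + sw_per_node)
    print(f"{name}: ${nic + sw_per_node}/node, ${total:,} for {nodes} nodes")
```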

How Big Is Big?

♦ Every 10X brings new challenges

64 processors was once considered large
  it hasn’t been “large” for quite a while
1024 processors is today’s “medium” size
2048-8096 processors is today’s “large”
  we’re struggling even here

♦ 100K processor systems are in construction

we have fundamental challenges …
… and no integrated research program


11

On the Horizon:
  • 10K CPU SGI Columbia @ NASA
  • 10K CPU Cray Red Storm @ Sandia
  • 130K CPU IBM BG/L @ LLNL

First 10,000 CPU Linux Cluster Makes Top500

12

BlueGene/L packaging hierarchy:
  • Chip (2 processors): 2.8/5.6 GF/s, 4 MB (cache)
  • Compute Card (2 chips, 2x1x1): 4 processors, 5.6/11.2 GF/s, 1 GB DDR
  • Node Card (32 chips, 4x4x2; 16 compute cards): 64 processors, 90/180 GF/s, 16 GB DDR
  • Rack (32 node boards, 8x8x16): 2,048 processors, 2.9/5.7 TF/s, 0.5 TB DDR
  • System (64 racks, 64x32x32): 131,072 processors, 180/360 TF/s, 32 TB DDR

IBM BlueGene/L

“Fastest Computer”: BG/L at 700 MHz, 16K processors, 8 racks. Peak: 45.9 Tflop/s; Linpack: 36.0 Tflop/s (78% of peak).

[Image: BlueGene/L Compute ASIC.]

Full system total of 131,072 processors
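The packaging hierarchy above multiplies out to the full-system processor count, and the "78% of peak" figure is just the measured Linpack rate over peak. A quick check of that arithmetic:

```python
# Multiply out the BlueGene/L packaging hierarchy quoted above.
procs_per_chip   = 2
chips_per_card   = 2
cards_per_node   = 16   # compute cards per node card
nodes_per_rack   = 32   # node cards per rack
racks_per_system = 64

procs = (procs_per_chip * chips_per_card * cards_per_node
         * nodes_per_rack * racks_per_system)
print(procs)                      # 131072 processors

# Efficiency of the early 8-rack system: Linpack / peak
print(f"{36.0 / 45.9:.0%}")       # ~78% of peak
```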


13

BlueGene/L Interconnection Networks

3 Dimensional Torus

  • Interconnects all compute nodes (65,536)
  • Virtual cut-through hardware routing
  • 1.4 Gb/s on all 12 node links (2.1 GB/s per node; see the arithmetic check below)
  • 1 µs latency between nearest neighbors, 5 µs to the farthest
  • 4 µs latency for one hop with MPI, 10 µs to the farthest
  • Communications backbone for computations
  • 0.7/1.4 TB/s bisection bandwidth, 68 TB/s total bandwidth
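The 2.1 GB/s per-node figure follows from the 12 torus links (6 nearest neighbors, each with a link in and a link out) at 1.4 Gb/s each; a quick check of that arithmetic:

```python
# Per-node 3D-torus bandwidth: 6 neighbors x 2 directions = 12 links.
links_per_node = 12
gbit_per_link = 1.4                              # Gb/s per link, as quoted above
gbytes_per_node = links_per_node * gbit_per_link / 8
print(f"{gbytes_per_node:.1f} GB/s per node")    # 2.1 GB/s
```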

Global Tree

  • Interconnects all compute and I/O nodes (1024)
  • One-to-all broadcast functionality
  • Reduction operations functionality
  • 2.8 Gb/s of bandwidth per link
  • Latency of one way tree traversal 2.5 µs
  • ~23TB/s total binary tree bandwidth (64k machine)

Ethernet

  • Incorporated into every node ASIC
  • Active in the I/O nodes (1:64)
  • All external comm. (file I/O, control, user interaction, etc.)

Low Latency Global Barrier and Interrupt

  • Latency of round trip 1.3 µs

Control Network

14

OS for IBM’s BG/L

♦ Service Node:

Linux SuSE SLES 8

♦ Front End Nodes:

Linux SuSE SLES 9

♦ I/O Nodes:

An embedded Linux

♦ Compute Nodes:

Home-brew OS

♦ Trend:

Extremely large systems run an “OS Suite”
The functional decomposition trend lends itself toward a customized, optimized point-solution OS
Hierarchical organization requires software to manage topology, call forwarding, and collective operations



15

Sandia National Lab’s Red Storm

  • Red Storm is a supercomputer system leveraging over 10,000 AMD Opteron™ processors connected by an innovative high speed, high bandwidth 3D mesh interconnect designed by Cray.
  • Cray was awarded $93M to build the Red Storm system to support the Department of Energy's nuclear stockpile stewardship program for advanced 3D modeling and simulation.
  • Scientists at Sandia National Lab helped with the architectural design of the Red Storm supercomputer.

16

Red Storm System Overview

  • 40 TF peak performance
  • 108 compute node cabinets, 16 service and I/O node cabinets, and 16 Red/Black switch cabinets
    – 10,368 compute processors - 2.0 GHz AMD Opteron™
    – 512 service and I/O processors (256P for red, 256P for black)
    – 10 TB DDR memory
  • 240 TB of disk storage (120 TB for red, 120 TB for black)
  • MPP System Software
    – Linux + lightweight compute node operating system
    – Managed and used as a single system
    – Easy to use programming environment
    – Common programming environment
    – High performance file system
    – Low overhead RAS and message passing
  • Approximately 3,000 ft² including disk systems
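A back-of-the-envelope check of the ~40 TF peak quoted above, assuming (as an illustration, typical for Opterons of that era) two floating-point operations per cycle per processor:

```python
# Red Storm peak, assuming 2 flops/cycle per 2.0 GHz Opteron.
compute_procs = 10_368
ghz = 2.0
flops_per_cycle = 2          # assumption, typical for Opteron of that era
peak_tf = compute_procs * ghz * flops_per_cycle / 1000
print(f"{peak_tf:.1f} TF peak")   # ~41 TF, consistent with the ~40 TF quoted
```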


17

DOE - Lawrence Livermore National Lab’s Itanium 2 Based Thunder System Architecture
1,024 nodes, 4,096 processors, 23 TFlop/s peak

System Parameters

  • Quad 1.4 GHz Itanium2 Madison Tiger4 nodes with 8.0 GB DDR266 SDRAM
  • <3 µs, 900 MB/s MPI latency and Bandwidth over QsNet Elan4
  • Support 400 MB/s transfers to Archive over quad Jumbo Frame Gb-Enet and QSW links from each Login node

  • 75 TB in local disk in 73 GB/node UltraSCSI320 disk
  • 50 MB/s POSIX serial I/O to any file system
  • 8.7 B:F = 192 TB global parallel file system in multiple RAID5
  • Lustre file system with 6.4 GB/s delivered parallel I/O performance
  • MPI I/O based performance with a large sweet spot
  • 32 < MPI tasks < 4,096
  • Software: RHEL 3.0, CHAOS, SLURM/DPCS, MPICH2, TotalView, Intel and GNU Fortran, C and C++ compilers

Contracts with

  • California Digital Corp for nodes and integration
  • Quadrics for Elan4
  • Data Direct Networks for global file system
  • Cluster File System for Lustre support


[Diagram: Thunder system layout. 1,002 Tiger4 compute nodes; 4 login nodes with 6 Gb-Enet; 2 service nodes; 32 Object Storage Targets (OSTs) delivering 200 MB/s each, 6.4 GB/s total Lustre; 2 metadata (fail-over) servers; 16 gateway nodes @ 400 MB/s delivered Lustre I/O over 4x1GbE; 1,024-port (16x64D64U+8x64D64U) QsNet Elan4; GbEnet federated switch; QsNet Elan3 and 100BaseT control and management networks.]

4,096 processors: 19.9 TFlop/s Linpack, 87% of peak
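The 23 TFlop/s peak and 87% Linpack efficiency follow from the processor count and clock above, assuming the usual four floating-point operations per cycle for Itanium 2 (two fused multiply-adds):

```python
# LLNL Thunder: peak and Linpack efficiency from the numbers above.
procs = 4096
ghz = 1.4
flops_per_cycle = 4              # Itanium 2: two fused multiply-adds per cycle
peak_tf = procs * ghz * flops_per_cycle / 1000
print(f"peak = {peak_tf:.1f} TFlop/s")        # ~22.9 TFlop/s (quoted as 23)
print(f"efficiency = {19.9 / peak_tf:.0%}")   # ~87% of peak
```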

18

High Bandwidth vs Commodity Systems

♦ High bandwidth systems have traditionally been vector computers

Designed for scientific problems
Capability computing

♦ Commodity processors are designed for web servers and the home PC market

(should be thankful that the manufacturers keep 64-bit floating point)
Used for cluster based computers, leveraging the price point

♦ Scientific computing needs are different

Require a better balance between data movement and floating point operations, which results in greater efficiency.

System Balance - MEMORY BANDWIDTH

  • Earth Simulator (NEC): introduced 2002; vector node architecture; 500 MHz cycle time; 8 Gflop/s peak per processor; 0.5 operands/flop (main memory)
  • Cray X1 (Cray): introduced 2003; vector; 800 MHz; 12.8 Gflop/s peak per processor; 0.33 operands/flop
  • ASCI Q (HP EV68): introduced 2002; Alpha; 1.25 GHz; 2.5 Gflop/s peak per processor; 0.1 operands/flop
  • MCR (Xeon): introduced 2002; Pentium; 2.4 GHz; 4.8 Gflop/s peak per processor; 0.055 operands/flop
  • Apple Xserve (IBM PowerPC): introduced 2003; PowerPC; 2 GHz; 8 Gflop/s peak per processor; 0.063 operands/flop
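The operands-per-flop figures are presumably sustainable main-memory bandwidth, counted in 8-byte operands per second, divided by peak flop rate. Inverting the table gives a rough sense of the memory bandwidth per processor each ratio implies (a sketch derived from the table, not vendor-quoted numbers):

```python
# Memory balance: operands/flop = (bytes/s to memory) / (8 bytes * flop/s).
# Invert the table above to show the bandwidth each ratio implies.
systems = {                      # peak Gflop/s per processor, operands/flop
    "Earth Simulator": (8.0, 0.5),
    "Cray X1":         (12.8, 0.33),
    "ASCI Q (EV68)":   (2.5, 0.1),
    "MCR (Xeon)":      (4.8, 0.055),
    "Apple Xserve":    (8.0, 0.063),
}
for name, (peak, opf) in systems.items():
    gbytes = opf * peak * 8      # implied main-memory bandwidth, GB/s
    print(f"{name}: ~{gbytes:.0f} GB/s per processor")
```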


19

System Balance (Network)

Network Speed (MB/s) vs Node Speed (flop/s)

[Chart: communication/computation balance (bytes/flop, higher is better) for Cray X1, Cray Red Storm, ASCI Red, Cray T3E/1200, Blue Gene/L, ASCI Blue Mountain, ASCI White, LANL Pink, PSC Lemieux, and ASCI Purple; values range from about 0.02 to 2.0 bytes/flop.]

20

The Top242

♦ Focus on machines that are > 1 TFlop/s on the Linpack benchmark

♦ Linpack Based

Pros
  One number
  Simple to define and rank
  Allows problem size to change with machine and over time

Cons
  Emphasizes only “peak” CPU speed and number of CPUs
  Does not stress local bandwidth
  Does not stress the network
  Does not test gather/scatter
  Ignores Amdahl’s Law (only does weak scaling)
  …

♦ 1993: #1 = 59.7 GFlop/s; #500 = 422 MFlop/s
♦ 2004: #1 = 35.8 TFlop/s; #500 = 813 GFlop/s

(1 Tflop/s cutoff)


21

Number of Systems on Top500 > 1 Tflop/s Over Time

[Chart: number of Top500 systems above 1 Tflop/s, November 1996 through November 2004.]

22

Factoids on Machines > 1 TFlop/s

242 systems

171 clusters (71%)

Average rate: 2.54 Tflop/s; median rate: 1.72 Tflop/s

Sum of processors in the Top242: 238,449 (sum for the Top500: 318,846)

Average processor count: 985; median processor count: 565

Most processors: 9,632 (ASCI Red)

Fewest processors: 124 (Cray X1)

Year of introduction for the 242 systems > 1 TFlop/s: 1998: 1; 1999: 3; 2000: 2; 2001: 6; 2002: 29; 2003: 82; 2004: 119.

[Chart: number of processors (100 to 10,000, log scale) vs. Top242 rank.]


23

Percent Of 242 Systems Which Use The Following Processors > 1 TFlop/s

More than half are based on a 32-bit architecture; 11 machines have vector instruction sets.

Pentium: 137 (58%); IBM: 46 (19%); Itanium: 22 (9%); AMD: 13 (5%); Alpha: 8 (3%); NEC: 6 (2%); Cray: 5 (2%); Sparc: 4 (2%); SGI: 1 (0%).

By manufacturer: IBM 150, Hewlett-Packard 26, SGI 11, Linux Networx 9, Dell 8, Cray Inc. 7, NEC 6, self-made 5, Fujitsu 3, with the remaining systems spread one or two each across Angstrom Microsystems, Hitachi, lenovo, Promicro/Quadrics, Atipa Technology, Bull SA, California Digital Corporation, Dawning, Exadron, HPTi, Intel, RackSaver, and Visual Technology.

24

Breakdown by Sector

Industry 40%, research 32%, academic 22%, vendor 4%, classified 2%, government 0%.

Percent Breakdown by Classes

Commodity processor w/ commodity interconnect: 172 (71%)
Custom processor w/ custom interconnect: 57 (24%)
Custom processor w/ commodity interconnect: 13 (5%)


25

What About Efficiency?

♦ Talking about Linpack
♦ What should the efficiency of a machine in the Top242 be?

Percent of peak for Linpack: > 90%? > 80%? > 70%? > 60%? …

♦ Remember this is O(n³) operations on O(n²) data; mostly matrix multiply
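The O(n³)-ops-on-O(n²)-data point is why Linpack efficiency can be high: the floating-point work available to hide each matrix element's memory (and network) traffic grows linearly with n, so a large enough problem keeps the floating-point units busy. A small illustration:

```python
# Linpack does O(n^3) flops on O(n^2) data, so the flops available per
# matrix element grow linearly with n: big problems hide data movement.
for n in (1_000, 10_000, 100_000, 1_000_000):
    flops = (2 / 3) * n**3
    words = n * n
    print(f"n = {n:>9,}: {flops / words:,.0f} flops per matrix element")
```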

Efficiency of Systems > 1 Tflop/s

[Chart: Linpack efficiency (0.1 to 1.0) vs. Top242 rank, colored by processor family (Alpha, Cray, Itanium, IBM, SGI, NEC, AMD, Pentium, Sparc); inset shows Rmax vs. rank on a log scale. Labeled systems include the Earth Simulator, LLNL Thunder, ASCI Q, IBM BG/L, NCSA, ECMWF, RIKEN, PNNL, and Dawning; the Top10 are marked.]


27

Efficiency of Systems > 1 Tflop/s

[Chart: Linpack efficiency vs. Top242 rank, colored by interconnect (GigE, Infiniband, Myrinet, proprietary, Quadrics, SCI); inset shows Rmax vs. rank. Labeled systems include the Earth Simulator, LLNL Thunder, ASCI Q, IBM BG/L, NCSA, ECMWF, RIKEN, PNNL, and Dawning; the Top10 are marked.]

Interconnect counts in the Top242: GigE 100, proprietary 71, Myricom 49, Quadrics 16, Infiniband 4, SCI 2.

Interconnects Used in the Top242

Efficiency for Linpack by interconnect (largest node count; min / max / average efficiency):

  • GigE: 1,128 nodes; 17% / 64% / 51%
  • SCI: 400 nodes; 64% / 74% / 68%
  • QsNetII: 4,096 nodes; 66% / 88% / 75%
  • Myrinet: 1,408 nodes; 44% / 79% / 64%
  • Infiniband: 768 nodes; 59% / 78% / 75%
  • Proprietary: 9,632 nodes; 45% / 99% / 68%


29

Country Percent by Total Performance

United States 60% Finland 0% India 0% Taiwan 0% Japan 12% United Kingdom 7% Germany 4% China 4% Korea, South 1% France 2% Canada 2% Mexico 1% Switzerland 0% Singapore 0% Saudi Arabia 0% Malaysia 0% Israel 1% New Zealand 1% Sweden 1% Netherlands 1% Brazil 1% Australia 0% Italy 1%

Swedish systems in the Top500 (Rmax / Rpeak in GFlop/s):

  • Rank 165: HP Opteron 2.2 GHz, Myrinet; 384 processors; HP; Umeå University / HPC2N; Sweden/2004; Rmax 1,329 / Rpeak 1,689.6
  • Rank 166: xSeries Xeon 3.06 GHz - Gig-E; 352 processors; IBM; Evergrow Grid; Sweden/2004; Rmax 1,321.76 / Rpeak 2,154.24
  • Rank 198: Pentium Xeon Cluster 2.2 GHz - SCI 3D; 400 processors; self-made; National Supercomputer Centre (NSC); Sweden/2002; Rmax 1,132 / Rpeak 1,760
  • Rank 263: Integrity Superdome, 1.5 GHz, HPlex; 192 processors; HP; Ericsson; Sweden/2004; Rmax 940.2 / Rpeak 1,152

30

KFlop/s per Capita (Flops/Pop)

[Chart: KFlop/s per capita by country, in ascending order: India, China, Brazil, Malaysia, Mexico, Saudi Arabia, Taiwan, Italy, Australia, Switzerland, South Korea, Netherlands, Finland, France, Singapore, Germany, Canada, Sweden, Japan, United Kingdom, Israel, New Zealand, United States.]

WETA Digital (Lord of the Rings)


31

♦ Google query attributes

150M queries/day (2,000/second); 100 countries; 4.2B documents in the index

60 Data centers

100,000 Linux systems in data centers around the world

15 TFlop/s and 1,000 TB total capability
40-80 1U/2U servers per cabinet
100 Mb Ethernet switches per cabinet with gigabit Ethernet uplink

growth from 4,000 systems (June 2000)

18M queries then

Performance and operation:
  simple reissue of failed commands to new servers
  no performance debugging; problems are not reproducible

Source: Monika Henzinger, Google & Cleve Moler

Forward links are referred to in the rows; back links are referred to in the columns.

Eigenvalue problem: Ax = λx, with n = 4.2×10⁹ (see MathWorks, Cleve’s Corner)

The matrix is the transition probability matrix of the Markov chain; Ax = x
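The ranking computation above is an eigenvalue problem Ax = x on the transition probability matrix of the link graph, usually solved by power iteration for the dominant eigenvector. A toy sketch on a hand-made 4-page link matrix, using the common convention that columns sum to one (the real n is about 4.2×10⁹):

```python
import numpy as np

# Toy power iteration for Ax = x, where A is a column-stochastic link matrix:
# entry A[i, j] is the probability of following a link from page j to page i.
A = np.array([[0.0, 0.5, 0.5, 0.0],
              [1/3, 0.0, 0.0, 0.5],
              [1/3, 0.5, 0.0, 0.5],
              [1/3, 0.0, 0.5, 0.0]])

x = np.full(4, 0.25)             # start from the uniform distribution
for _ in range(100):
    x = A @ x
    x /= x.sum()                 # keep it a probability vector
print(x)                         # stationary vector: the page "ranks"
```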

32

Sony PlayStation2

♦ Emotion Engine: 6 Gflop/s peak
♦ Superscalar MIPS 300 MHz core + vector coprocessor + graphics/DRAM

About $200; 70M sold

♦ 8K D cache; 32 MB memory, not expandable (the OS goes here as well)
♦ 32-bit floating point; not IEEE
♦ 2.4 GB/s to memory (0.38 B/Flop)
♦ Potential 20 floating-point ops/cycle: FPU w/ FMAC+FDIV, VPU1 w/ 4 FMAC+FDIV, VPU2 w/ 4 FMAC+FDIV, EFU w/ FMAC+FDIV
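The 6 Gflop/s peak is the 20 potential floating-point operations per cycle times the 300 MHz clock, counting each FMAC as two operations and, as an assumption consistent with the 20 ops/cycle figure, not counting the divide units toward peak:

```python
# PlayStation2 Emotion Engine peak: FMAC units x 2 flops each x clock.
fmacs = 1 + 4 + 4 + 1        # FPU, VPU1 (4), VPU2 (4), EFU
flops_per_cycle = fmacs * 2  # a fused multiply-accumulate counts as 2 flops
clock_ghz = 0.3
print(f"{flops_per_cycle} flops/cycle -> "
      f"{flops_per_cycle * clock_ghz:.1f} Gflop/s peak")
```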


33

Computing On Toys

♦ Sony PlayStation2

6.2 GF peak
70M polygons/second
10.5M transistors
superscalar RISC core plus vector units, each: 19 mul-adds & 1 divide, each 7 cycles

♦ $199 retail

loss leader for game sales

♦ 100 unit cluster at U of I

Linux software and vector unit use
over 0.5 TF peak
but hard to program & hard to extract performance …

34

Petascale Systems In 2008

♦ Technology trends

multicore processors
  IBM Power4 and Sun UltraSPARC IV; Itanium “Montecito” in 2005; quad-core and beyond are coming
reduced power consumption
  laptop and mobile market drivers
increased I/O and memory interconnect integration
  PCI Express, Infiniband, …

♦ Let’s look forward a few years to 2008

8-way or 16-way cores (8 or 16 processors/chip)
~10 GF cores (processors) and 4-way nodes (4 8-way cores/node)
12x Infiniband-like interconnect

♦ With 10 GF processors: 100K processors and 3,100 nodes (4-way with 8 cores each); 1-3 MW of power, at a minimum
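The node count follows directly from the core and socket figures above; the power line is only a rough scaling, and the per-processor wattage below is an assumed round number for illustration:

```python
# 2008 petascale sketch: node count and aggregate peak from the bullets above.
procs = 100_000
gf_per_proc = 10
cores_per_socket = 8
sockets_per_node = 4

peak_pf = procs * gf_per_proc / 1e6
nodes = procs // (cores_per_socket * sockets_per_node)
print(f"peak ~ {peak_pf:.0f} PFlop/s, {nodes:,} nodes")   # ~1 PF, ~3,125 nodes

watts_per_proc = 20          # assumed round figure, for illustration only
print(f"power ~ {procs * watts_per_proc / 1e6:.0f} MW")   # ~2 MW
```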


35

Software Evolution and Faults

♦ Cost dynamics

people costs are rising; hardware costs are falling

♦ Two divergent software world views

parallel systems

life is good – deus ex machina

Internet

evil everywhere, trust no one, we’ll all die horribly

♦ What does this mean for software?

abandon the pre-industrial “craftsman model”; adopt an “automated evolution” model

36

Fault Tolerance: Motivation

♦ Trends in HPC:

High end systems with thousands of processors

♦ Increased probability of a node failure

Most systems nowadays are robust

♦ MPI widely accepted in scientific computing

Process faults not tolerated in MPI model

Mismatch between hardware and (non fault- tolerant) programming paradigm of MPI.


37

Fault Tolerance in the Computation

♦ Some next generation systems are being designed with 100K processors (IBM Blue Gene/L)

♦ MTTF of 10⁵-10⁶ hours for a component

sounds like a lot until you divide by 10⁵!
A failure for such a system is likely to be just a few hours, perhaps minutes, away (see the sketch after this list).

♦ Application checkpoint/restart is today’s typical fault tolerance method

♦ Problem with MPI: no recovery from faults in the standard

♦ Many clusters based on commodity parts don’t have error correcting primary memory

♦ Caches are not SECDED
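The failure-time argument above is simple MTTF division: with 10⁵ roughly independent parts, the expected time to the first failure drops by a factor of 10⁵. A minimal sketch:

```python
# System mean time to failure ~ component MTTF / number of components
# (assuming independent, roughly exponentially distributed failures).
component_mttf_hours = 1e5
for parts in (1e3, 1e4, 1e5):
    system_mttf = component_mttf_hours / parts
    print(f"{int(parts):>7} parts -> system MTTF ~ {system_mttf:.1f} hours")
```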

38

Real Crisis With HPC Is With The Software

♦ Programming is stuck

Arguably it hasn’t changed since the 70’s

♦ It’s time for a change

Complexity is rising dramatically
  highly parallel and distributed systems
  from 10 to 100 to 1,000 to 10,000 to 100,000 processors!
  multidisciplinary applications

♦ A supercomputer application and its software are usually much longer-lived than the hardware

Hardware life is typically five years at most.
Fortran and C are the main programming models.

♦ Software is a major cost component of modern technologies.

The tradition in HPC system procurement is to assume that the software is free.


39

Motivation: Self Adapting Numerical Software (SANS) Effort

♦ Optimizing software to exploit the features of a given system has historically been an exercise in hand customization.

Time consuming and tedious
Hard to predict performance from source code
Must be redone for every architecture and compiler
Software technology often lags architecture
Best algorithm may depend on input, so some tuning may be needed at run-time.

♦ There is a need for quick/dynamic deployment of optimized routines.

40

Performance Tuning Methodology

[Diagram: software installation flow covering input parameters, system specifics, hardware probe, parameter study of code versions, code generation, performance database, and user options.]

Software Installation

(done once per system)

Parameter study of the hw

Generate multiple versions of code, with different values of key performance parameters

Run and measure the performance for various versions

Pick best and generate library

Optimize over 8 parameters

  • Cache blocking
  • Register blocking (2)
  • FP unit latency
  • Memory fetch
  • Interleaving loads & computation
  • Loop unrolling
  • Loop overhead minimization

Similar to FFTW

Software Generation Strategy - ATLAS BLAS
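The ATLAS-style generate-and-search loop is conceptually simple: enumerate candidate values of the tuning parameters, time a generated variant for each, and keep the fastest. A schematic sketch; build_kernel and the timing below are stand-ins for the real code generator and timing driver, not ATLAS code:

```python
import itertools, time

# Schematic autotuning search: try parameter combinations, time each
# generated variant, keep the fastest.
def build_kernel(blocking, unroll):
    """Stand-in for a code generator parameterized by blocking and unrolling."""
    def kernel():
        s = 0.0
        for i in range(0, 4096, blocking):
            for j in range(unroll):
                s += i * j
        return s
    return kernel

best = None
for blocking, unroll in itertools.product((16, 32, 64), (1, 2, 4, 8)):
    kernel = build_kernel(blocking, unroll)
    t0 = time.perf_counter()
    kernel()
    elapsed = time.perf_counter() - t0
    if best is None or elapsed < best[0]:
        best = (elapsed, blocking, unroll)

print(f"best: blocking={best[1]}, unroll={best[2]} ({best[0] * 1e6:.0f} us)")
```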


41

Self Adapting Numerical Software - SANS Effort

♦ Provide software technology to aid in high performance on commodity processors, clusters, and grids.
♦ Pre-run time (library building stage) and run time optimization.
♦ Integrated performance modeling and analysis
♦ Automatic algorithm selection – polyalgorithmic functions
♦ Automated installation process
♦ Can be expanded to areas such as communication software and selection of numerical algorithms

[Diagram: tuning system selecting the “best” software segment, e.g. different segments for short messages vs. block messages.]

42

Generic Code Optimization

♦ Follow on to ATLAS: take generic code segments and perform optimizations via experiments

♦ Collaboration with the ROSE project (source-to-source code transformation / optimization) at Lawrence Livermore National Laboratory

Daniel Quinlan and Qing Yi
LoopProcessor -bk3 4 -unroll 4 ./dgemv.c
We generate the test cases and also the timing driver.

♦ Also collaboration with Jim Demmel and Kathy Yelick at Berkeley under an NSF ITR effort.


43

Some Current Unmet Needs

♦ Performance / portability
♦ Fault tolerance
♦ Better programming models

Global shared address space
Visible locality

♦ Maybe coming soon (incremental, yet offering real benefits):

Global Address Space (GAS) languages: UPC, Co-Array Fortran, Titanium
“Minor” extensions to existing languages
More convenient than MPI
Have performance transparency via explicit remote memory references

♦ The critical cycle of prototyping, assessment, and commercialization must be a long-term, sustaining investment, not a one time, crash program.

44

Collaborators / Support

Slides are online:

Google “dongarra”; click on “talks”

♦ Top500 Team: Erich Strohmaier, NERSC; Hans Meuer, Mannheim; Horst Simon, NERSC