An Overview of High Performance Computing and Challenges for the Future

Jack Dongarra
University of Tennessee / Oak Ridge National Laboratory / University of Manchester
February 13, 2009
A Growth-Factor of a Billion in Performance in a Career

[Chart: peak performance from 1950 to 2010, spanning 1 KFlop/s to 1 PFlop/s, across the scalar, super scalar, vector, parallel, and super scalar/special purpose/parallel eras. Systems shown include EDSAC 1, UNIVAC 1, IBM 7090, CDC 6600, IBM 360/195, CDC 7600, Cray 1, Cray X-MP, Cray 2, TMC CM-2, TMC CM-5, Cray T3D, ASCI Red, ASCI White Pacific, IBM RoadRunner, and Cray Jaguar. 2X transistors/chip every 1.5 years.]

Milestones (floating point operations per second, Flop/s):
- 1941: 1
- 1945: 100
- 1949: 1,000 (1 KiloFlop/s, KFlop/s)
- 1951: 10,000
- 1961: 100,000
- 1964: 1,000,000 (1 MegaFlop/s, MFlop/s)
- 1968: 10,000,000
- 1975: 100,000,000
- 1987: 1,000,000,000 (1 GigaFlop/s, GFlop/s)
- 1992: 10,000,000,000
- 1993: 100,000,000,000
- 1997: 1,000,000,000,000 (1 TeraFlop/s, TFlop/s)
- 2000: 10,000,000,000,000
- 2007: 478,000,000,000,000 (478 TFlop/s)
- 2009: 1,100,000,000,000,000 (1.1 PetaFlop/s)
The TOP500 List
- H. Meuer, H. Simon, E. Strohmaier, & JD
- Listing of the 500 most powerful computers in the world
- Yardstick: Rmax from the LINPACK MPP benchmark (Ax = b, dense problem; see the sketch below)
- Updated twice a year: at SC'xy in the States in November and at the meeting in Germany in June
- All data available from www.top500.org
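The yardstick is the rate achieved while solving a dense linear system Ax = b. As a rough illustration (only a sketch, not the actual HPL benchmark code), one can time a LAPACK-backed dense solve and convert the time to a flop rate with the standard LU operation count of 2/3 n^3 + 2 n^2:

import time
import numpy as np

# Minimal sketch of an Rmax-style measurement: time a dense solve of Ax = b
# and convert to Gflop/s using the usual LU operation count (2/3*n^3 + 2*n^2).
n = 4000                      # illustrative size; real TOP500 runs use far larger problems
A = np.random.rand(n, n)
b = np.random.rand(n)

t0 = time.perf_counter()
x = np.linalg.solve(A, b)     # LU factorization plus triangular solves
elapsed = time.perf_counter() - t0

flops = (2.0 / 3.0) * n**3 + 2.0 * n**2
print(f"n={n}: {elapsed:.2f} s, {flops / elapsed / 1e9:.1f} Gflop/s")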
Performance Development
[Chart: total TOP500 performance, 1994-2008, on a log scale from 100 Mflop/s to 100 Pflop/s. Three curves: SUM (all 500 systems combined), N=1 (the fastest system), and N=500 (the last system on the list). In 1993 the values were 1.17 TFlop/s (SUM), 59.7 GFlop/s (N=1), and 400 MFlop/s (N=500); in 2008 they reached 16.9 PFlop/s, 1.1 PFlop/s, and 12.6 TFlop/s respectively. Chart annotations include "My Laptop" and "6-8 years".]
Performance Development and Projections
[Chart: the SUM, N=1, and N=500 curves extended by straight-line projection, with the performance axis running up through the Eflop/s range.]

Concurrency at each performance level:
- Cray 2: 1 Gflop/s, O(1) thread
- ASCI Red: 1 Tflop/s, O(10^3) threads
- RoadRunner: 1.1 Pflop/s, O(10^6) threads
- Exascale: 1 Eflop/s, O(10^9) threads

(A job that takes roughly a minute at 1 Eflop/s takes on the order of 8 hours at a Pflop/s, a year at a Tflop/s, and 1000 years at a Gflop/s.)
Processors / Systems

[Pie chart of processor types across the 500 systems. Legend (in order): Xeon E54xx (Harpertown) 37%, Xeon 51xx (Woodcrest) 14%, Xeon 53xx (Clovertown) 13%, Xeon L54xx (Harpertown) 7%, Opteron Quad Core 6%, Opteron Dual Core 6%, PowerPC 440 3%, PowerPC 450 2%, POWER6 2%, Others.]

By vendor: Intel 71%, AMD 13%, IBM 7%.
Cluster Interconnects

[Chart: number of TOP500 systems using GigE, Myrinet, InfiniBand, and Quadrics interconnects, 1999-2008.]
Efficiency

[Scatter plot: LINPACK efficiency (Rmax/Rpeak, from 0 to 1) versus TOP500 ranking, 1 through 500.]
Cores Per Socket

[Chart: number of TOP500 systems by cores per socket (1, 2, 4, and 9).]
- 4 cores: 67%
- 2 cores: 31%
- 9 cores: 7 systems
- Single core: 4 systems
Core Count

[Chart: number of TOP500 systems by total core count, 1993-2008, binned from 1 core up to 128k+ cores.]
Countries / System Share

United States 58%, United Kingdom 9%, France 5%, Germany 5%, Japan 4%, China 3%, Italy 2%, Sweden 2%, India 2%, Russia 2%, Spain 1%, Poland 1%, others 6%.
Customer Segments

[Chart: number of TOP500 systems by customer segment (Industry, Research, Academic, Classified, Vendor, Government, Others), 1993-2008.]
Distribution of the Top500
[Chart: Rmax (Tflop/s) versus TOP500 rank, falling from 1.1 Pflop/s at rank 1 to 12.6 Tflop/s at rank 500.]
- 2 systems > 1 Pflop/s
- 19 systems > 100 Tflop/s
- 51 systems > 50 Tflop/s
- 119 systems > 25 Tflop/s
Replacement Rate

[Chart: number of systems replaced on each new TOP500 list, 1993-2008 (267 on the most recent list).]
32nd List: The TOP10

Rank / Site / Computer / Country / Cores / Rmax [Tflop/s] / Rmax/Rpeak / Power [MW] / MF/W
1. DOE/NNSA/LANL, IBM Roadrunner (BladeCenter QS22/LS21), USA: 129,600 cores, 1105.0 Tflop/s, 76%, 2.48 MW, 445 MF/W
2. DOE/Oak Ridge National Laboratory, Cray Jaguar (Cray XT5 QC 2.3 GHz), USA: 150,152 cores, 1059.0 Tflop/s, 77%, 6.95 MW, 152 MF/W
3. NASA/Ames Research Center/NAS, SGI Pleiades (SGI Altix ICE 8200EX), USA: 51,200 cores, 487.0 Tflop/s, 80%, 2.09 MW, 233 MF/W
4. DOE/NNSA/LLNL, IBM eServer Blue Gene Solution, USA: 212,992 cores, 478.2 Tflop/s, 80%, 2.32 MW, 205 MF/W
5. DOE/Argonne National Laboratory, IBM Blue Gene/P Solution, USA: 163,840 cores, 450.3 Tflop/s, 81%, 1.26 MW, 357 MF/W
6. NSF/Texas Advanced Computing Center/Univ. of Texas, Sun Ranger (SunBlade x6420), USA: 62,976 cores, 433.2 Tflop/s, 75%, 2.0 MW, 217 MF/W
7. DOE/NERSC/LBNL, Cray Franklin (Cray XT4), USA: 38,642 cores, 266.3 Tflop/s, 75%, 1.15 MW, 232 MF/W
8. DOE/Oak Ridge National Laboratory, Cray Jaguar (Cray XT4), USA: 30,976 cores, 205.0 Tflop/s, 79%, 1.58 MW, 130 MF/W
9. DOE/NNSA/Sandia National Laboratories, Cray Red Storm (XT3/4), USA: 38,208 cores, 204.2 Tflop/s, 72%, 2.5 MW, 81 MF/W
10. Shanghai Supercomputer Center, Dawning 5000A (Windows HPC 2008), China: 30,720 cores, 180.6 Tflop/s, 77%
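The MF/W column is simply Rmax divided by the reported power. A quick check of two of the rows above (a sketch; the helper function name is mine):

def mflops_per_watt(rmax_tflops: float, power_mw: float) -> float:
    """Energy efficiency: Rmax converted to Mflop/s, divided by power in watts."""
    return (rmax_tflops * 1e6) / (power_mw * 1e6)

# Roadrunner (row 1): Rmax 1105 Tflop/s at 2.48 MW
print(mflops_per_watt(1105.0, 2.48))   # ~445.6, the 445 MF/W in the table

# Jaguar XT5 (row 2): Rmax 1059 Tflop/s at 6.95 MW
print(mflops_per_watt(1059.0, 6.95))   # ~152.4, the 152 MF/W in the table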
LANL Roadrunner: A Petascale System in 2008

- "Connected Unit" cluster: 192 Opteron nodes (180 with 2 dual-Cell blades connected with 4 PCIe x8 links)
- 17 clusters
- 2nd stage InfiniBand 4x DDR interconnect (18 sets of 12 links to 8 switches)
- Based on the 100 Gflop/s (DP) Cell chip
- ≈ 13,000 Cell HPC chips, ≈ 1.33 PetaFlop/s from the Cells
- ≈ 7,000 dual-core Opterons, ≈ 122,000 cores
- Hybrid design (2 kinds of chips and 3 kinds of cores); programming is required at 3 levels
- Node layout: a Cell chip attached to each core of the dual-core Opteron chip
ORNL's Newest System: Jaguar XT5

Office of Science. The systems will be combined after acceptance of the new XT5 upgrade. Each system will be linked to the file system through 4x-DDR InfiniBand.
                               Jaguar Total    XT5        XT4
Peak Performance (TF)          1,645           1,382      263
AMD Opteron Cores              181,504         150,176    31,328
System Memory (TB)             362             300        62
Disk Bandwidth (GB/s)          284             240        44
Disk Space (TB)                10,750          10,000     750
Interconnect Bandwidth (TB/s)  532             374        157
University of Tennessee's HPC System

- National Institute for Computational Sciences
- Housed at ORNL
- Operated for the NSF
- Named Kraken
Today:
- Cray XT5 (608 TF) + Cray XT4 (167 TF)
- XT5: 16,512 sockets, 66,048 cores
- XT4: 4,512 sockets, 18,048 cores
- Number 15 on the Top500
Power is an Industry-Wide Problem
“Hiding in Plain Sight, Google Seeks More Power”, by John Markoff, June 14, 2006
Google facilities
- leveraging hydroelectric power
- old aluminum plants
Microsoft and Yahoo are building big data centers upstream in Wenatchee and Quincy, Wash.
- To keep up with Google, which means they need cheap electricity and readily accessible data networking

Microsoft Quincy, Wash.: 470,000 sq ft, 47 MW!
ORNL/UTK Power Cost Projections 2007-2011
- Over the next 5 years ORNL/UTK will deploy 2 large petascale systems
- Using 4 MW today, going to 15 MW before year end
- By 2012 could be using more than 50 MW!!
- Cost estimates based on $0.07 per kWh (see the sketch below)
Includes both DOE and NSF systems.
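At $0.07 per kWh, those power levels translate into electricity bills along these lines (a back-of-the-envelope sketch, assuming a constant draw around the clock):

def annual_cost_usd(megawatts: float, dollars_per_kwh: float = 0.07) -> float:
    """Electricity cost per year, assuming a constant load running 24x7."""
    hours_per_year = 24 * 365
    return megawatts * 1000 * hours_per_year * dollars_per_kwh

for mw in (4, 15, 50):
    print(f"{mw:>2} MW -> ${annual_cost_usd(mw):,.0f} per year")
# 4 MW -> ~$2.5M, 15 MW -> ~$9.2M, 50 MW -> ~$30.7M per year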
Something’s Happening Here…
- In the "old days" it was: each year processors would become faster
- Today the clock speed is fixed or getting slower
- Things are still doubling every 18-24 months
- Moore's Law reinterpreted:
- Number of cores doubles every 18-24 months
From K. Olukotun, L. Hammond, H. Sutter, and B. Smith
A hardware issue just became a software problem
Power Cost of Frequency

- Power ∝ Voltage² x Frequency (V²F)
- Frequency ∝ Voltage
- Power ∝ Frequency³
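A quick numeric illustration of where this cubic relationship leads (a sketch with made-up numbers: the 20% frequency change and the assumption of perfect parallel scaling are illustrative, not from the slides):

def relative_power(freq_scale: float, cores: int = 1) -> float:
    """Power relative to one core at nominal frequency, using power ~ frequency^3."""
    return cores * freq_scale ** 3

def relative_performance(freq_scale: float, cores: int = 1) -> float:
    """Idealized throughput, assuming work scales perfectly across cores."""
    return cores * freq_scale

# One core clocked 20% higher: 1.2x the performance for ~1.73x the power.
print(relative_performance(1.2), relative_power(1.2))

# Two cores each clocked 20% lower: 1.6x the performance for ~1.02x the power.
print(relative_performance(0.8, cores=2), relative_power(0.8, cores=2))

This is the arithmetic behind "more cores at lower clock rates": the same power budget buys far more throughput, provided the software can use the extra threads.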
What’s Next?
[Diagram: candidate chip organizations for different classes of chips (Home, Games/Graphics, Business, Scientific): all large cores; mixed large and small cores; all small cores; and many small cores with many floating-point cores plus SRAM and 3D stacked memory.]
And then there are the GPGPUs: NVIDIA's Tesla T10P
T10P chip
- 240 cores; 1.5 GHz
- Tpeak 1 Tflop/s - 32 bit floating point
- Tpeak 100 Gflop/s - 64 bit floating point
S1070 board
- 4 T10P devices
- 700 Watts
GTX 280
- 1 T10P; 1.3 GHz
- Tpeak 864 Gflop/s - 32 bit floating point
- Tpeak 86.4 Gflop/s - 64 bit floating point
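These peaks follow from cores x clock x floating point operations per core per clock. A small sketch (the 3 flops per core per clock for single precision is an assumption used here for illustration; it is not stated on the slides):

def peak_gflops(cores: int, clock_ghz: float, flops_per_core_per_clock: float) -> float:
    """Theoretical peak rate: every core issues a fixed number of flops each clock."""
    return cores * clock_ghz * flops_per_core_per_clock

# T10P, single precision: 240 cores at 1.5 GHz, assuming 3 flops per core per clock.
print(peak_gflops(240, 1.5, 3))   # ~1080 Gflop/s, i.e. the ~1 Tflop/s figure above

# Double precision runs on far fewer units per chip, hence the roughly 10x lower peak.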
Intel’s Larrabee Chip
- Many x86 IA cores
- Scalable to Tflop/s
- New cache architecture
- New vector instruction set
  - Vector memory operations
  - Conditionals
  - Integer and floating point arithmetic
- New vector processing unit / wide SIMD
Architecture of Interest
Manycore chip composed of hybrid cores:
- Some general purpose
- Some graphics
- Some floating point
Architecture of Interest
Board composed of multiple chips sharing memory
Architecture of Interest
Rack composed of multiple boards
Architecture of Interest
A room full of these racks. Think millions of cores.
Near Term Situation
- Million-core systems and beyond are on the horizon
- By 2012 there will be more systems deployed in the 200K-1M core range
- By 2020 there will be systems with perhaps 100M cores
- Personal systems with > 1,000 cores within 5 years (I have over 100 cores in my office now)
- Personal systems with requirements for 1M threads are not too far fetched (think GPUs)
Exascale Computing
- Exascale systems (10^18 Flop/s) are likely feasible by 2017 ± 2 years
- 10-100 million processing elements (cores or mini-cores), with chips perhaps as dense as 1,000 cores per socket; clock rates will grow more slowly
- 3D packaging likely
- Large-scale optics-based interconnects
- 10-100 PB of aggregate memory (see the memory-balance sketch below)
- More than 10,000s of I/O channels to 10-100 exabytes of secondary storage; disk bandwidth to storage ratios not optimal for HPC use
- Hardware- and software-based fault management
- Achievable performance per watt will likely be the primary measure of progress
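One way to make the memory figure concrete is the bytes-per-flop balance it implies, which would sit well below even today's systems. A rough check (a sketch; the helper is mine, and the Jaguar XT5 figures are the ones quoted earlier):

def bytes_per_flop(memory_tb: float, peak_tflops: float) -> float:
    """Memory balance: bytes of main memory per flop/s of peak performance."""
    return (memory_tb * 1e12) / (peak_tflops * 1e12)

print(bytes_per_flop(300, 1382))      # Jaguar XT5: ~0.22 bytes per flop/s
print(bytes_per_flop(10_000, 1e6))    # exascale, 10 PB vs 1 Eflop/s: 0.01
print(bytes_per_flop(100_000, 1e6))   # exascale, 100 PB vs 1 Eflop/s: 0.10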
Conclusions
Moore’s Law Reinterpreted
- Number of cores per chip doubles every two years, while clock speed stays roughly stable
- Threads of execution double every 2 years
- 100 M cores coming

Need to deal with systems with millions of concurrent threads
- Future generations will have billions of threads!
- MPI and programming languages from the 60's will not make it

Power limiting clock rate growth
- Power becomes the architectural driver for exascale systems.
Collaborators
Top500 Team
- Erich Strohmaier, NERSC
- Hans Meuer, Mannheim
- Horst Simon, NERSC