
High Performance Computing, Computational Grid, and Numerical Libraries

The 14th Symposium on Computer Architecture and High Performance Computing, Vitoria/ES, Brazil, October 28-30, 2002


  1. High Performance Computing, Computational Grid, and Numerical Libraries
     Jack Dongarra, Innovative Computing Lab, University of Tennessee
     http://www.cs.utk.edu/~dongarra/

     Technology Trends: Microprocessor Capacity
     Moore’s Law: 2X transistors per chip every 1.5 years.
     Gordon Moore (co-founder of Intel) predicted in 1965 that the transistor density of semiconductor chips would double roughly every 18 months.
     Microprocessors have become smaller, denser, and more powerful.
     Not just processors: bandwidth, storage, etc.
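The doubling rule on the Moore’s Law slide can be turned into a growth factor with one line of arithmetic. The following is a minimal sketch, assuming a fixed doubling period of 1.5 years as stated on the slide; the function name `moore_growth_factor` is hypothetical and only used for illustration.

```python
def moore_growth_factor(years: float, doubling_period_years: float = 1.5) -> float:
    """Capacity multiplier after `years`, assuming one doubling per period.

    The 1.5-year default is the figure quoted on the slide; real hardware
    does not follow a perfectly fixed period.
    """
    return 2.0 ** (years / doubling_period_years)


if __name__ == "__main__":
    # Print the multiplier for a few illustrative time spans.
    for years in (1.5, 3, 10, 15):
        print(f"{years:>4} years -> x{moore_growth_factor(years):,.0f}")
```

Under this assumption, 15 years corresponds to ten doublings, which is roughly a thousandfold increase.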

  2. Moore’s Law
     [Figure: peak performance (log scale, 1 KFlop/s to 1 PFlop/s) versus year (1950 to 2010), progressing through the scalar, super scalar, vector, and parallel eras. Machines plotted: EDSAC 1, UNIVAC 1, IBM 7090, CDC 6600, IBM 360/195, CDC 7600, Cray 1, Cray X-MP, Cray 2, TMC CM-2, TMC CM-5, Cray T3D, ASCI Red, ASCI White Pacific.]

     H. Meuer, H. Simon, E. Strohmaier, & JD
     - Listing of the 500 most powerful computers in the world
     - Yardstick: Rmax from LINPACK MPP (Ax=b, dense problem; TPP performance)
       [Figure: rate versus problem size]
     - Updated twice a year: SC'xy in the States in November, meeting in Mannheim, Germany in June
     - All data available from www.top500.org
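The yardstick named on the slide, solving a dense system Ax=b and reporting a rate, can be illustrated on a small scale. The sketch below is not the LINPACK or HPL code: it is a plain Gaussian elimination with partial pivoting in pure Python, and it converts the run time into a rate using the conventional LINPACK operation count of 2n³/3 + 2n². The helper names (`solve_dense`, `linpack_flops`, `measure_rate`) and the problem size are assumptions made for this example.

```python
import random
import time


def solve_dense(a, b):
    """Solve Ax = b by Gaussian elimination with partial pivoting.

    Works on copies, so the caller's matrix and right-hand side are untouched.
    """
    n = len(b)
    a = [row[:] for row in a]
    b = b[:]
    for k in range(n):
        # Pick the row with the largest entry in column k as the pivot row.
        p = max(range(k, n), key=lambda i: abs(a[i][k]))
        if a[p][k] == 0.0:
            raise ValueError("matrix is singular")
        a[k], a[p] = a[p], a[k]
        b[k], b[p] = b[p], b[k]
        # Eliminate column k from the rows below the pivot.
        for i in range(k + 1, n):
            m = a[i][k] / a[k][k]
            for j in range(k, n):
                a[i][j] -= m * a[k][j]
            b[i] -= m * b[k]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(a[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (b[i] - s) / a[i][i]
    return x


def linpack_flops(n):
    """Conventional operation count credited for a dense solve of order n."""
    return 2.0 * n**3 / 3.0 + 2.0 * n**2


def measure_rate(n=100, seed=0):
    """Return (flop/s, max error) for a random system whose solution is all ones."""
    rng = random.Random(seed)
    a = [[rng.uniform(-1.0, 1.0) for _ in range(n)] for _ in range(n)]
    b = [sum(row) for row in a]  # chosen so the exact solution is x = (1, ..., 1)
    start = time.perf_counter()
    x = solve_dense(a, b)
    elapsed = time.perf_counter() - start
    error = max(abs(xi - 1.0) for xi in x)
    return linpack_flops(n) / elapsed, error


if __name__ == "__main__":
    rate, error = measure_rate()
    print(f"rate: {rate / 1e6:.2f} MFlop/s, max error: {error:.2e}")
```

A pure-Python loop like this runs far below what the hardware can deliver; the point is only to show how a solve time becomes a flop rate.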

  3. Fastest Computer Over Time
     A computation that took 1 full year to complete in 1980 can now be done in ~10 hours!
     [Figure: XY scatter plot of GFlop/s (0 to 70) by year, 1990 to 2000. Systems shown, with processor counts in parentheses: Cray Y-MP (8), TMC CM-2 (2048), Fujitsu VP-2600.]

     Fastest Computer Over Time
     A computation that took 1 full year to complete in 1980 can now be done in ~16 minutes!
     [Figure: XY scatter plot of GFlop/s (0 to 700) by year, 1990 to 2000. Systems added: NEC SX-3 (4), TMC CM-5 (1024), Fujitsu VPP-500 (140), Intel Paragon (6788), Hitachi CP-PACS (2040).]

  4. Fastest Computer Over Time
     A computation that took 1 full year to complete in 1980 can today be done in ~27 seconds!
     [Figure: XY scatter plot of GFlop/s (0 to 7000) by year, 1990 to 2000. Systems added: Intel ASCI Red (9152), SGI ASCI Blue Mountain (5040), ASCI Blue Pacific SST (5808), Intel ASCI Red Xeon (9632), ASCI White Pacific (7424).]

     Fastest Computer Over Time
     A computation that took 1 full year to complete in 1980 can today be done in ~5.4 seconds!
     [Figure: XY scatter plot of TFlop/s (0 to 70) by year, 1990 to 2002. System added: Japanese Earth Simulator, NEC (5104).]
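The four "1 full year" comparisons on the Fastest Computer Over Time slides each imply a speedup factor. A minimal sketch of that arithmetic follows, assuming a 365-day year; the dictionary and function names are hypothetical.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600  # assumption: a 365-day year

# Run times quoted on the four slides, converted to seconds.
RUNTIMES_SECONDS = {
    "~10 hours": 10 * 3600,
    "~16 minutes": 16 * 60,
    "~27 seconds": 27,
    "~5.4 seconds": 5.4,
}


def speedup(new_runtime_seconds: float) -> float:
    """Factor by which a one-year computation has been shortened."""
    return SECONDS_PER_YEAR / new_runtime_seconds


if __name__ == "__main__":
    for label, seconds in RUNTIMES_SECONDS.items():
        print(f"1 year -> {label}: about {speedup(seconds):,.0f}x faster")
```

Going from ~10 hours to ~5.4 seconds spans roughly four orders of magnitude, which matches the change of the vertical axis from tens of GFlop/s to tens of TFlop/s plus the growth within each plot.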

  5. Machines at the Top of the List

     | Year | Computer | Measured Gflop/s | Factor from previous year | Theoretical peak Gflop/s | Factor from previous year | Number of processors | Efficiency |
     |------|----------|------------------|---------------------------|--------------------------|---------------------------|----------------------|------------|
     | 2002 | Earth Simulator Computer, NEC | 35860 | 5.0 | 40960 | 3.7 | 5120 | 88% |
     | 2001 | ASCI White-Pacific, IBM SP Power 3 | 7226 | 1.5 | 11136 | 1.0 | 7424 | 65% |
     | 2000 | ASCI White-Pacific, IBM SP Power 3 | 4938 | 2.1 | 11136 | 3.5 | 7424 | 44% |
     | 1999 | ASCI Red Intel Pentium II Xeon core | 2379 | 1.1 | 3207 | 0.8 | 9632 | 74% |
     | 1998 | ASCI Blue-Pacific SST, IBM SP 604E | 2144 | 1.6 | 3868 | 2.1 | 5808 | 55% |
     | 1997 | Intel ASCI Option Red (200 MHz Pentium Pro) | 1338 | 3.6 | 1830 | 3.0 | 9152 | 73% |
     | 1996 | Hitachi CP-PACS | 368.2 | 1.3 | 614 | 1.8 | 2048 | 60% |
     | 1995 | Intel Paragon XP/S MP | 281.1 | 1 | 338 | 1.0 | 6768 | 83% |
     | 1994 | Intel Paragon XP/S MP | 281.1 | 2.3 | 338 | 1.4 | 6768 | 83% |
     | 1993 | Fujitsu NWT | 124.5 | | 236 | | 140 | 53% |

     A Tour de Force in Engineering
     ♦ Homogeneous, Centralized, Proprietary, Expensive!
     ♦ Target Application: CFD - Weather, Climate, Earthquakes
     ♦ 640 NEC SX/6 Nodes (mod)
       - 5120 CPUs which have vector ops
     ♦ 40 TeraFlops (peak)
     ♦ $250-$500 million for things in building
     ♦ Footprint of 4 tennis courts
     ♦ 7 MWatts
       - Say 10 cent/KWhr - $16.8K/day = $6M/year!
     ♦ Expect to be on top of Top500 until 60-100 TFlop ASCI machine arrives
     ♦ For the Top500 (June 2002)
       - Equivalent ~ 1/6 Σ Top 500
       - Performance of ESC > Σ Next Top 12 Computers
       - Σ of all the DOE computers = 27.5 TFlop/s
       - Performance of ESC > All the DOE + DOD machines (37.2 TFlop/s)
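Two pieces of arithmetic on this page can be checked directly: the efficiency and year-over-year factors in the table, and the electricity cost quoted for the Earth Simulator. The sketch below assumes that efficiency means measured performance divided by theoretical peak, which reproduces the table's percentages; the function names are hypothetical, and only three of the table rows are included for brevity.

```python
# (year, measured Gflop/s, theoretical peak Gflop/s), taken from the table above.
TOP_MACHINES = [
    (2002, 35860.0, 40960.0),
    (2001, 7226.0, 11136.0),
    (2000, 4938.0, 11136.0),
]


def efficiency(measured: float, peak: float) -> float:
    """Fraction of theoretical peak achieved (assumed definition)."""
    return measured / peak


def factor_from_previous(current: float, previous: float) -> float:
    """Year-over-year ratio, as in the 'Factor from previous year' columns."""
    return current / previous


def power_cost(megawatts: float, dollars_per_kwh: float):
    """Return (cost per day, cost per 365-day year) for a constant power draw."""
    kwh_per_day = megawatts * 1000.0 * 24.0
    per_day = kwh_per_day * dollars_per_kwh
    return per_day, per_day * 365.0


if __name__ == "__main__":
    for (year, measured, peak), (_, prev_measured, _) in zip(TOP_MACHINES, TOP_MACHINES[1:]):
        print(
            f"{year}: efficiency {efficiency(measured, peak):.0%}, "
            f"measured factor {factor_from_previous(measured, prev_measured):.1f}"
        )
    per_day, per_year = power_cost(7.0, 0.10)
    print(f"7 MW at $0.10/kWh: ${per_day:,.0f}/day, ${per_year / 1e6:.1f}M/year")
```

The cost function assumes the machine draws its full 7 MW around the clock, which is the same simplification the slide makes.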

  6. Top10 of the Top500

     | Rank | Manufacturer | Computer | Rmax [TF/s] | Installation Site | Country | Year | Area of Installation | # Proc |
     |------|--------------|----------|-------------|-------------------|---------|------|----------------------|--------|
     | 1 | NEC | Earth-Simulator | 35.86 | Earth Simulator Center | Japan | 2002 | Research | 5120 |
     | 2 | IBM | ASCI White, SP Power3 | 7.23 | Lawrence Livermore National Laboratory | USA | 2000 | Research | 8192 |
     | 3 | HP | AlphaServer SC ES45 1 GHz | 4.46 | Pittsburgh Supercomputing Center | USA | 2001 | Academic | 3016 |
     | 4 | HP | AlphaServer SC ES45 1 GHz | 3.98 | Commissariat a l’Energie Atomique (CEA) | France | 2001 | Research | 2560 |
     | 5 | IBM | SP Power3 375 MHz | 3.05 | NERSC/LBNL | USA | 2001 | Research | 3328 |
     | 6 | HP | AlphaServer SC ES45 1 GHz | 2.92 | Los Alamos National Laboratory | USA | 2002 | Research | 2048 |
     | 7 | Intel | ASCI Red | 2.38 | Sandia National Laboratory | USA | 1999 | Research | 9632 |
     | 8 | IBM | pSeries 690 1.3 GHz | 2.31 | Oak Ridge National Laboratory | USA | 2002 | Research | 864 |
     | 9 | IBM | ASCI Blue Pacific SST, IBM SP 604e | 2.14 | Lawrence Livermore National Laboratory | USA | 1999 | Research | 5808 |
     | 10 | IBM | pSeries 690 1.3 GHz | 2.00 | IBM/US Army Research Lab (ARL) | USA | 2002 | Vendor | 768 |

     TOP500 - Performance
     [Figure: performance (log scale, 100 Mflop/s to 1 Pflop/s) for each list from Jun-93 to Jun-02. Three series: SUM grows from 1.17 TF/s to 220 TF/s; N=1 grows from 59.7 GF/s to 35.8 TF/s; N=500 grows from 0.4 GF/s to 134 GF/s. Labeled N=1 systems: Fujitsu 'NWT' (NAL), Intel ASCI Red (Sandia), IBM ASCI White (LLNL), NEC Earth Simulator. "My Laptop" is marked for comparison.]

  7. Performance Extrapolation
     [Figure: the Sum, N=1, and N=500 series (log scale, 100 MFlop/s to 10 PFlop/s) extrapolated from Jun-93 through Jun-10. Earth Simulator, ASCI Purple, and "My Laptop" are marked.]

     Manufacturers
     [Figure: number of systems (0 to 500) per manufacturer for each list from Jun-93 to Jun-02. Manufacturers: Cray, SGI, IBM, TMC, Intel, Sun, HP, Fujitsu, NEC, Hitachi, others.]
     HP 168 (12 < 100), IBM 164 (47 < 100)
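An extrapolation like the one on the Performance Extrapolation slide can be approximated from the endpoint values shown on the TOP500 Performance chart (Jun-93 and Jun-02). The sketch below is a two-point exponential fit, which is an assumption of this example: the slide does not say how its trend lines were fitted, and a fit through two endpoints is cruder than a regression over all lists. All names are hypothetical.

```python
import math

YEARS_BETWEEN_LISTS = 9.0  # Jun-93 to Jun-02

# (Jun-93 value, Jun-02 value) in GFlop/s, read off the TOP500 Performance chart.
SERIES_GFLOPS = {
    "Sum": (1170.0, 220000.0),
    "N=1": (59.7, 35800.0),
    "N=500": (0.4, 134.0),
}


def annual_growth_factor(start: float, end: float, years: float) -> float:
    """Constant yearly multiplier that takes `start` to `end` in `years`."""
    return (end / start) ** (1.0 / years)


def years_to_reach(current: float, target: float, factor: float) -> float:
    """Years until `current` reaches `target` at a constant yearly `factor`."""
    return math.log(target / current) / math.log(factor)


if __name__ == "__main__":
    one_pflops_in_gflops = 1.0e6
    for name, (start, end) in SERIES_GFLOPS.items():
        factor = annual_growth_factor(start, end, YEARS_BETWEEN_LISTS)
        wait = years_to_reach(end, one_pflops_in_gflops, factor)
        print(f"{name}: x{factor:.2f} per year, 1 PFlop/s about {wait:.1f} years after Jun-02")
```

Because the N=1 endpoint includes the large jump from the Earth Simulator, a two-point fit for that series is likely to overstate the long-run growth rate.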

  8. Architectures
     [Figure: number of systems (0 to 500) per architecture class for each list from Jun-93 to Jun-02. Classes: Single Processor, SMP, MPP, SIMD, Constellation, Cluster - NOW. Example systems labeled: CM2, CM5, Paragon, T3D, T3E, SP2, ASCI Red, Y-MP C90, SX3, VP500, Sun HPC, Cluster of Sun HPC.]
     Constellation: # of p/n ≥ n

     Kflops per Inhabitant
     [Figure: bar chart of Kflops per inhabitant by country, vertical axis 0 to 700. Bar values shown: 643, 450, 358, 283, 245, 207, 203, 158, 141, 76, 67.]
     White is ES contribution and Blue is ASCI contribution

  9. 80 Clusters on the Top500
     ♦ A total of 42 Intel based and 8 AMD based PC clusters are in the TOP500.
       - 31 of these Intel based clusters are IBM Netfinity systems delivered by IBM.
     ♦ A substantial part of these are installed at industrial customers, especially in the oil industry.
       - Including 5 Sun and 5 Alpha based clusters and 21 HP AlphaServer.
     ♦ 14 of these clusters are labeled as 'Self-Made'.

     Cluster on the Top500
     [Figure: number of clusters (0 to 80) for each list from Jun-97 to Jun-02, by type: AMD, Intel, IBM Netfinity, Alpha, HP Alpha Server, Sparc.]
     Processor Breakdown: Pentium III, 37, 46%; Alpha, 25, 31%; AMD, 8, 10%; Sparc, 5, 6%; Pentium 4, 3, 4%; Itanium, 2, 3%
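The percentages in the Processor Breakdown chart follow from the cluster counts. A minimal sketch of that calculation, using the counts from the chart and round-half-up to whole percent (an assumption that happens to reproduce the chart's labels); the names are hypothetical.

```python
import math

# Cluster counts per processor family, from the Processor Breakdown chart.
CLUSTER_PROCESSORS = {
    "Pentium III": 37,
    "Alpha": 25,
    "AMD": 8,
    "Sparc": 5,
    "Pentium 4": 3,
    "Itanium": 2,
}

# Families counted as "Intel based" for the purpose of this example.
INTEL_FAMILIES = ("Pentium III", "Pentium 4", "Itanium")


def percent_shares(counts):
    """Share of each family in whole percent, rounding halves up."""
    total = sum(counts.values())
    return {name: math.floor(100.0 * n / total + 0.5) for name, n in counts.items()}


if __name__ == "__main__":
    print("total clusters:", sum(CLUSTER_PROCESSORS.values()))
    print("Intel based:", sum(CLUSTER_PROCESSORS[name] for name in INTEL_FAMILIES))
    for name, share in percent_shares(CLUSTER_PROCESSORS).items():
        print(f"{name}: {share}%")
```

Python's built-in `round` uses round-half-to-even, so it would turn Itanium's 2.5% into 2%; the explicit half-up rounding is used here to match the 3% shown on the chart.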
